flashinfer.testing
This module provides testing utilities for benchmarking and performance analysis in FlashInfer.
Test Environment Setup
Set random seed for reproducibility during testing.
Sleep after kernel run.
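As a rough illustration of what seeding for reproducibility involves, the sketch below seeds Python's `random` module and, if PyTorch happens to be installed, the PyTorch generators as well. The helper name `seed_everything` is hypothetical and is not the FlashInfer function summarized above.

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed the RNGs available in this environment."""
    random.seed(seed)
    try:
        import torch  # optional; skipped when PyTorch is not installed

        torch.manual_seed(seed)  # seeds the CPU generator and the CUDA generators
    except ImportError:
        pass


# Seeding twice with the same value should reproduce the same draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

Seeding once at the start of each test keeps randomly generated inputs identical between runs, which makes numerical mismatches easier to reproduce.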
Performance Analysis
FLOPS Calculation
Calculate FLOPs for a given attention layer.
Calculate FLOPs for a given attention layer from actual sequence lengths, which are provided as 1D tensors.
Calculate TFLOPs per second for a given attention layer.
Calculate TFLOPs per second for a given attention layer with actual sequence lengths.
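The arithmetic behind these helpers can be sketched in plain Python. This is a common way to count attention FLOPs (two matrix multiplications, each costing 2 FLOPs per multiply-add), not necessarily the exact formula FlashInfer uses. All function names and parameters below are assumptions for illustration, and plain lists stand in for the 1D tensors of sequence lengths.

```python
def attention_flops_sketch(batch_size, qo_len, kv_len, head_dim_qk,
                           head_dim_vo, num_qo_heads, causal=False):
    """Hypothetical FLOP count for one attention layer (QK^T plus PV)."""
    if causal:
        # Assumes qo_len <= kv_len with queries aligned to the end of the keys:
        # query i attends to (kv_len - qo_len + i + 1) keys.
        pairs = qo_len * (kv_len - qo_len) + qo_len * (qo_len + 1) // 2
    else:
        pairs = qo_len * kv_len
    # 2 FLOPs per multiply-add, over head_dim_qk (QK^T) and head_dim_vo (PV).
    return 2 * batch_size * num_qo_heads * pairs * (head_dim_qk + head_dim_vo)


def attention_flops_actual_lens_sketch(qo_lens, kv_lens, head_dim_qk,
                                       head_dim_vo, num_qo_heads, causal=False):
    """Sum the per-request FLOPs when every request has its own lengths."""
    return sum(
        attention_flops_sketch(1, q, k, head_dim_qk, head_dim_vo,
                               num_qo_heads, causal)
        for q, k in zip(qo_lens, kv_lens)
    )


def tflops_per_sec_sketch(flops, time_ms):
    """Convert a FLOP count and a time in milliseconds to TFLOPs per second."""
    return flops / time_ms / 1e9


dense = attention_flops_sketch(1, 4, 4, 8, 8, 1)
masked = attention_flops_sketch(1, 4, 4, 8, 8, 1, causal=True)
ragged = attention_flops_actual_lens_sketch([4, 2], [4, 2], 8, 8, 1)
print(dense, masked, ragged)
```

Using actual sequence lengths matters for ragged batches: padding every request to the longest length would overstate the work done and inflate the reported TFLOPs per second.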
Throughput Analysis
Calculate the achieved throughput in TB per second for a given attention layer.
Calculate the achieved throughput in TB per second for a given attention layer with actual sequence lengths.
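A throughput figure of this kind is typically the number of bytes moved divided by the measured time. The sketch below assumes that the query, key, and value tensors are each read once and the output is written once. That byte-counting model, the function names, and the parameters are all assumptions for illustration rather than FlashInfer's definition.

```python
def attention_bytes_sketch(batch_size, qo_len, kv_len, head_dim_qk, head_dim_vo,
                           num_qo_heads, num_kv_heads,
                           q_bytes=2, kv_bytes=2, o_bytes=2):
    """Hypothetical byte count: read Q, K, V once and write O once."""
    q = qo_len * num_qo_heads * head_dim_qk * q_bytes
    kv = kv_len * num_kv_heads * (head_dim_qk + head_dim_vo) * kv_bytes
    o = qo_len * num_qo_heads * head_dim_vo * o_bytes
    return batch_size * (q + kv + o)


def tb_per_sec_sketch(num_bytes, time_ms):
    """Convert a byte count and a time in milliseconds to TB per second."""
    return num_bytes / time_ms / 1e9


# Decode-like shape: one query token against a 1024-token KV cache in 16-bit.
moved = attention_bytes_sketch(1, 1, 1024, 128, 128, 32, 8)
print(moved, tb_per_sec_sketch(moved, 0.01))
```

For decode-style shapes like this one the KV cache dominates the byte count, which is why such kernels are usually judged by bandwidth rather than by FLOPs.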
GPU Benchmarking
Benchmark wrapper that chooses among CUPTI, CUDA events, or CUDA Graphs.
Benchmark kernel execution time using CUDA events (no CUDA graphs).
Benchmark GPU time by constructing CUDA graphs with kernel launches and then replaying the graph.
Benchmark GPU time using CUPTI activity tracing to measure kernel execution time.
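To make the CUDA-event approach concrete, here is a minimal sketch built on PyTorch's `torch.cuda.Event`. The helpers `bench_with_cuda_events` and `summarize_ms` are hypothetical names, not FlashInfer functions, and the sketch leaves out refinements a production benchmark might add, such as flushing caches or pausing between runs.

```python
import statistics


def bench_with_cuda_events(fn, warmup_iters=5, bench_iters=20):
    """Hypothetical helper: per-iteration GPU times in milliseconds."""
    import torch  # lazy import; calling this function needs PyTorch and a CUDA GPU

    # Warm up so one-time costs (compilation, allocation) are not measured.
    for _ in range(warmup_iters):
        fn()
    torch.cuda.synchronize()

    starts = [torch.cuda.Event(enable_timing=True) for _ in range(bench_iters)]
    ends = [torch.cuda.Event(enable_timing=True) for _ in range(bench_iters)]
    for start, end in zip(starts, ends):
        start.record()  # timestamp enqueued on the current stream before the kernel
        fn()
        end.record()    # timestamp enqueued after the kernel
    torch.cuda.synchronize()  # wait for all recorded events to complete

    return [start.elapsed_time(end) for start, end in zip(starts, ends)]


def summarize_ms(times_ms):
    """Reduce a list of per-iteration times to a few robust statistics."""
    return {
        "median_ms": statistics.median(times_ms),
        "min_ms": min(times_ms),
        "max_ms": max(times_ms),
    }


summary = summarize_ms([3.0, 1.0, 2.0])
print(summary)
```

On a machine with a CUDA device, one would pass a zero-argument callable, for example `bench_with_cuda_events(lambda: torch.matmul(a, b))`, and feed the result to `summarize_ms`. The median is usually preferred over the mean because occasional slow iterations skew the average.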