flashinfer.testing
This module provides testing utilities for benchmarking and performance analysis in FlashInfer.
Test Environment Setup
Set random seed for reproducibility during testing.
Sleep after kernel run.
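As an illustration of what seeding for reproducibility usually involves, here is a minimal sketch. The helper name `set_seed_sketch` is hypothetical and is not the FlashInfer function; it seeds Python's `random` module and, only if they happen to be installed, NumPy and PyTorch.

```python
import random


def set_seed_sketch(seed: int) -> None:
    """Hypothetical helper: seed every RNG a test is likely to touch."""
    random.seed(seed)
    try:
        import numpy as np  # optional dependency

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draws.
set_seed_sketch(42)
first_draw = random.random()
set_seed_sketch(42)
second_draw = random.random()
```

Seeding once at the start of each test keeps randomly generated inputs identical across runs, which makes numerical mismatches easier to reproduce.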
Performance Analysis
FLOPS Calculation
Calculate FLOPs for a given attention layer.
Calculate FLOPs for a given attention layer using actual per-sequence lengths, provided as 1D tensors.
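To make the FLOP count concrete, the sketch below uses the common convention of 2 FLOPs per multiply-add across the two attention matrix multiplications (QK^T and PV). The function names, parameter names, and the bottom-right-aligned causal mask are assumptions for illustration; the formula FlashInfer uses may differ in detail, and plain Python lists stand in for the 1D length tensors.

```python
def attention_flops_sketch(batch_size, qo_len, kv_len, head_dim_qk,
                           head_dim_vo, num_qo_heads, causal=False):
    """Hypothetical helper: FLOPs of QK^T plus PV for fixed-length inputs."""
    if causal:
        if kv_len < qo_len:
            raise ValueError("causal sketch assumes kv_len >= qo_len")
        # Bottom-right-aligned mask: query i sees kv_len - qo_len + i + 1 keys.
        pairs = qo_len * (kv_len - qo_len) + qo_len * (qo_len + 1) // 2
    else:
        pairs = qo_len * kv_len
    # 2 FLOPs per multiply-add, over both matrix multiplications.
    return 2 * batch_size * num_qo_heads * pairs * (head_dim_qk + head_dim_vo)


def attention_flops_varlen_sketch(qo_lens, kv_lens, head_dim_qk,
                                  head_dim_vo, num_qo_heads, causal=False):
    """Hypothetical helper: sum per-request FLOPs for ragged batches."""
    return sum(
        attention_flops_sketch(1, q, k, head_dim_qk, head_dim_vo,
                               num_qo_heads, causal)
        for q, k in zip(qo_lens, kv_lens)
    )


dense_flops = attention_flops_sketch(1, 4, 4, 8, 8, 2)
causal_flops = attention_flops_sketch(1, 4, 4, 8, 8, 2, causal=True)
ragged_flops = attention_flops_varlen_sketch([1, 4], [4, 4], 8, 8, 2, causal=True)
```

Summing per request matters for ragged batches: padding every request to the longest length would overstate the work actually performed.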
Calculate achieved TFLOPS (tera floating-point operations per second) for a given attention layer.
Calculate achieved TFLOPS for a given attention layer using actual sequence lengths.
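Converting a FLOP count and a measured time into TFLOPS is plain unit arithmetic. The sketch below assumes the time is given in milliseconds; `tflops_sketch` is a hypothetical name, not the FlashInfer function.

```python
def tflops_sketch(flops: float, time_ms: float) -> float:
    """Hypothetical helper: achieved TFLOPS from a FLOP count and a time in ms."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    # flops / (time_ms * 1e-3 s) / 1e12  ==  flops / time_ms / 1e9
    return flops / time_ms / 1e9


# Example: non-causal attention, batch 8, 32 heads, 4096 x 4096 tokens, head_dim 128.
example_flops = 4 * 8 * 32 * 4096 * 4096 * 128
example_tflops = tflops_sketch(example_flops, time_ms=5.0)
```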
Throughput Analysis
Calculate the memory throughput, in TB per second, achieved for a given attention layer.
Calculate the memory throughput, in TB per second, achieved for a given attention layer using actual sequence lengths.
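Memory throughput can be estimated by counting the bytes of Q, K, V, and the output that a kernel must read or write at least once, then dividing by the measured time. The sketch below is an assumption-laden illustration: the function names, the per-tensor byte sizes, and the choice to count each tensor exactly once are not taken from FlashInfer.

```python
def attention_bytes_sketch(batch_size, qo_len, kv_len, head_dim_qk, head_dim_vo,
                           num_qo_heads, num_kv_heads,
                           q_bytes=2, kv_bytes=2, o_bytes=2):
    """Hypothetical helper: minimum bytes moved, counting each tensor once."""
    q_elems = batch_size * qo_len * num_qo_heads * head_dim_qk
    k_elems = batch_size * kv_len * num_kv_heads * head_dim_qk
    v_elems = batch_size * kv_len * num_kv_heads * head_dim_vo
    o_elems = batch_size * qo_len * num_qo_heads * head_dim_vo
    return q_elems * q_bytes + (k_elems + v_elems) * kv_bytes + o_elems * o_bytes


def tb_per_sec_sketch(total_bytes: float, time_ms: float) -> float:
    """Hypothetical helper: TB/s from a byte count and a time in ms."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    # bytes / (time_ms * 1e-3 s) / 1e12  ==  bytes / time_ms / 1e9
    return total_bytes / time_ms / 1e9


# Decode-style example: one query token against 1024 cached tokens, fp16.
decode_bytes = attention_bytes_sketch(1, 1, 1024, 128, 128, 32, 8)
decode_tbps = tb_per_sec_sketch(decode_bytes, time_ms=0.01)
```

For decode-style workloads, where the KV cache dominates the traffic, this figure is usually more informative than TFLOPS because the kernel is bound by memory bandwidth rather than compute.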
GPU Benchmarking
Unified GPU benchmarking interface with configurable timing backends.
Benchmark kernel execution time using CUDA events (no CUDA graphs).
Benchmark GPU time using CUDA graphs with amortized kernel launch overhead.
Benchmark GPU time using CUPTI activity tracing for precise kernel timing.
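All of these timing backends typically share the same outer structure: a few untimed warm-up runs, followed by repeated timed runs whose samples are then summarized. The sketch below shows only that structure, using host-side `time.perf_counter` as a stand-in so it runs without a GPU. It is not how FlashInfer measures time; a GPU backend would bracket each run with CUDA events, a captured CUDA graph, or CUPTI records instead. The names `bench_time_sketch`, `dry_run_iters`, and `repeat_iters` are hypothetical.

```python
import statistics
import time


def bench_time_sketch(fn, dry_run_iters=3, repeat_iters=10):
    """Hypothetical helper: return one wall-clock sample (ms) per timed run."""
    # Warm-up runs are executed but not timed (caches, lazy init, JIT).
    for _ in range(dry_run_iters):
        fn()
    times_ms = []
    for _ in range(repeat_iters):
        start = time.perf_counter()
        fn()
        end = time.perf_counter()
        times_ms.append((end - start) * 1e3)
    return times_ms


calls = []
samples = bench_time_sketch(lambda: calls.append(sum(range(1000))),
                            dry_run_iters=2, repeat_iters=5)
# The median is less sensitive to outliers than the mean.
median_ms = statistics.median(samples)
```

Returning the raw samples rather than a single number leaves the choice of statistic (median, minimum, percentiles) to the caller.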