flashinfer.testing
This module provides testing utilities for benchmarking and performance analysis in FlashInfer.
Test Environment Setup
|
Set random seed for reproducibility during testing. |
|
Sleep after kernel run. |
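As a minimal illustration of why seeding matters for reproducible tests, the sketch below reseeds a generator and checks that the same values come back. It uses only Python's standard `random` module; `set_seed_sketch` is a hypothetical name, and which generators the FlashInfer helper actually seeds is not stated here.

```python
import random


def set_seed_sketch(seed):
    """Hypothetical helper: seed Python's built-in RNG (illustration only)."""
    random.seed(seed)


# Draw the same three numbers twice by reseeding with the same value.
set_seed_sketch(42)
first = [random.random() for _ in range(3)]
set_seed_sketch(42)
second = [random.random() for _ in range(3)]
```

With the same seed, `first` and `second` are identical, which is what makes a randomized test repeatable.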
Performance Analysis
FLOPS Calculation
|
Calculate FLOPs for a given attention layer. |
Calculate FLOPs for a given attention layer using actual sequence lengths, provided as 1D tensors. |
|
|
Calculate the achieved throughput in TFLOP/s for a given attention layer. |
Calculate the achieved throughput in TFLOP/s for a given attention layer using actual sequence lengths. |
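The arithmetic behind these helpers can be sketched as follows. This is an illustration under stated assumptions, not FlashInfer's implementation: the function names and parameters are hypothetical, each multiply-add is counted as 2 FLOPs, softmax cost is ignored, and the causal case assumes `qo_len <= kv_len` with the queries aligned to the end of the keys. The library's exact accounting may differ.

```python
def attention_flops_sketch(batch_size, qo_len, kv_len, head_dim_qk,
                           head_dim_vo, num_qo_heads, causal=False):
    """Hypothetical helper: estimate FLOPs of one attention forward pass."""
    if causal:
        # Query i attends to kv_len - qo_len + i + 1 keys; summing over
        # i = 0..qo_len-1 gives this closed form (always an integer).
        pairs = qo_len * (2 * kv_len - qo_len + 1) // 2
    else:
        pairs = qo_len * kv_len
    # Q @ K^T costs head_dim_qk multiply-adds per (query, key) pair and
    # P @ V costs head_dim_vo; each multiply-add counts as 2 FLOPs.
    return 2 * batch_size * num_qo_heads * pairs * (head_dim_qk + head_dim_vo)


def attention_flops_actual_lens_sketch(qo_lens, kv_lens, head_dim_qk,
                                       head_dim_vo, num_qo_heads, causal=False):
    """Same estimate, summed over per-request lengths (plain lists here)."""
    return sum(
        attention_flops_sketch(1, q, k, head_dim_qk, head_dim_vo,
                               num_qo_heads, causal)
        for q, k in zip(qo_lens, kv_lens)
    )


def tflops_per_sec_sketch(flops, time_ms):
    """Convert a FLOP count and a runtime in milliseconds to TFLOP/s."""
    return flops / (time_ms * 1e-3) / 1e12


dense = attention_flops_sketch(1, 4, 4, 8, 8, 2)
masked = attention_flops_sketch(1, 4, 4, 8, 8, 2, causal=True)
ragged = attention_flops_actual_lens_sketch([4, 4], [4, 4], 8, 8, 2)
```

For this toy configuration, `dense` is 1024 and `masked` is 640, since the causal mask keeps 10 of the 16 query-key pairs.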
Throughput Analysis
|
Calculate the achieved memory throughput in TB/s for a given attention layer. |
Calculate the achieved memory throughput in TB/s for a given attention layer using actual sequence lengths. |
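A rough sketch of a memory-throughput estimate is shown below. It assumes the kernel reads Q, K, and V once and writes O once, and that all tensors share one element size; the function names, parameters, and traffic model are assumptions for illustration and may not match what FlashInfer counts.

```python
def attention_bytes_sketch(batch_size, qo_len, kv_len, head_dim_qk,
                           head_dim_vo, num_qo_heads, num_kv_heads,
                           bytes_per_elem=2):
    """Hypothetical helper: bytes moved by one attention forward pass."""
    q_elems = qo_len * num_qo_heads * head_dim_qk   # read Q
    k_elems = kv_len * num_kv_heads * head_dim_qk   # read K
    v_elems = kv_len * num_kv_heads * head_dim_vo   # read V
    o_elems = qo_len * num_qo_heads * head_dim_vo   # write O
    return batch_size * bytes_per_elem * (q_elems + k_elems + v_elems + o_elems)


def tb_per_sec_sketch(num_bytes, time_ms):
    """Convert a byte count and a runtime in milliseconds to TB/s."""
    return num_bytes / (time_ms * 1e-3) / 1e12


# Decode-style example: one query token against 1024 cached tokens, fp16.
moved = attention_bytes_sketch(1, 1, 1024, 128, 128, 32, 8)
bandwidth = tb_per_sec_sketch(moved, 1.0)
```

In this decode-style case the K and V reads dominate the traffic, which is why such kernels are usually judged by TB/s rather than TFLOP/s.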
GPU Benchmarking
|
Benchmark kernel execution time without using CUDA graphs. |
|
Benchmark GPU time by constructing a CUDA graph that contains the kernel launches and then replaying the graph. |
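The general shape of a benchmark loop (warm-up iterations, then timed iterations, then a summary statistic) can be sketched on the CPU with the standard library alone. This is only an analogue: `bench_cpu_time_sketch` and its parameters are hypothetical, and a GPU benchmark would time kernels with device-side events or graph replay rather than `time.perf_counter`.

```python
import statistics
import time


def bench_cpu_time_sketch(fn, warmup_iters=3, repeat_iters=10):
    """Hypothetical helper: return per-iteration wall times in milliseconds."""
    # Warm-up runs are discarded so one-time costs do not skew the results.
    for _ in range(warmup_iters):
        fn()
    times_ms = []
    for _ in range(repeat_iters):
        start = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - start) * 1e3)
    return times_ms


times = bench_cpu_time_sketch(lambda: sum(range(10_000)))
median_ms = statistics.median(times)
```

Reporting the median rather than the mean makes the result less sensitive to occasional slow iterations.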