flashinfer.topk

Efficient Top-K selection kernels.

Top-K Selection

top_k(input, k[, sorted])

Radix-based Top-K selection.
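As a rough illustration of what radix-based selection means, the sketch below finds the k largest values by scanning 8-bit digits from most to least significant and narrowing the candidate set at each pass. It is a pure-Python reference, not the library's kernel: the helper names `float_to_ordered_key` and `radix_top_k`, the 8-bit digit width, and the float32 key mapping are all assumptions made for this example.

```python
import struct


def float_to_ordered_key(x: float) -> int:
    """Map a float32 value to a uint32 whose integer order matches float order."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Negative floats: flip every bit. Non-negative floats: set the sign bit.
    return bits ^ 0xFFFFFFFF if bits & 0x80000000 else bits | 0x80000000


def radix_top_k(values, k):
    """Return the indices of the k largest values (in no particular order).

    Hypothetical reference helper, not part of flashinfer.
    """
    if not 0 < k <= len(values):
        raise ValueError("k must be in [1, len(values)]")
    keys = [float_to_ordered_key(v) for v in values]
    candidates = list(range(len(values)))
    remaining = k  # how many elements still have to be picked
    selected = []
    for shift in (24, 16, 8, 0):  # most significant digit first
        hist = [0] * 256
        for i in candidates:
            hist[(keys[i] >> shift) & 0xFF] += 1
        # Walk buckets from the largest digit until the k-th element's bucket.
        pivot = 255
        while hist[pivot] < remaining:
            remaining -= hist[pivot]
            pivot -= 1
        # Everything in a higher bucket is certainly in the top k.
        selected += [i for i in candidates if ((keys[i] >> shift) & 0xFF) > pivot]
        # Only the pivot bucket is still undecided.
        candidates = [i for i in candidates if ((keys[i] >> shift) & 0xFF) == pivot]
    # The surviving candidates all share one key, so any of them will do.
    selected += candidates[:remaining]
    return selected


values = [0.5, -1.0, 3.0, 2.0, 3.0, -7.5]
print(sorted(radix_top_k(values, 3)))
```

Each pass needs only a histogram and a filter, so the selection never performs a full sort. That property is the usual reason radix selection suits GPUs.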

top_k_page_table_transform(input, ...[, ...])

Fused Top-K selection + Page Table Transform for sparse attention.
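One plausible reading of the fused operation, sketched unfused in pure Python: pick the top-k positions in each row of scores, then look those positions up in that row's page table. The function name `page_table_transform_ref`, its argument names, and the `-1` padding for rows shorter than k are assumptions made for this example, not the library's signature.

```python
import heapq


def page_table_transform_ref(scores, page_table, lengths, k):
    """Unfused reference: per-row top-k followed by a page-table gather.

    Hypothetical helper, not part of flashinfer.
    """
    out = []
    for row_scores, row_pages, n in zip(scores, page_table, lengths):
        if n <= k:
            # Fewer valid entries than k: keep all of them.
            picked = list(range(n))
        else:
            # Indices of the k highest scores among the first n entries.
            picked = heapq.nlargest(k, range(n), key=row_scores.__getitem__)
        row = [row_pages[j] for j in picked]
        # Assumed convention: pad short rows with -1.
        out.append(row + [-1] * (k - len(row)))
    return out


scores = [[0.1, 0.9, 0.4, 0.7], [0.3, 0.2, 0.0, 0.0]]
page_table = [[10, 11, 12, 13], [20, 21, 22, 23]]
print(page_table_transform_ref(scores, page_table, lengths=[4, 2], k=2))
```

Fusing the two steps in a single kernel avoids writing the intermediate top-k indices to memory before the gather.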

top_k_ragged_transform(input, offsets, ...)

Fused Top-K selection + Ragged Index Transform for sparse attention.
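A minimal unfused sketch of the ragged variant, assuming each row of scores corresponds to one segment of a flattened (ragged) buffer and that `offsets[i]` is where segment i starts. The selected row-local positions are shifted by that offset to give indices into the flat buffer. The name `ragged_transform_ref`, the `lengths` argument, and the `-1` padding are assumptions made for this example.

```python
import heapq


def ragged_transform_ref(scores, offsets, lengths, k):
    """Unfused reference: per-row top-k, then shift by the row's start offset.

    Hypothetical helper, not part of flashinfer.
    """
    out = []
    for row_scores, offset, n in zip(scores, offsets, lengths):
        # Row-local indices of the highest scores among the first n entries.
        picked = heapq.nlargest(min(k, n), range(n), key=row_scores.__getitem__)
        # Convert row-local positions into positions in the flat buffer.
        row = [offset + j for j in picked]
        # Assumed convention: pad short rows with -1.
        out.append(row + [-1] * (k - len(row)))
    return out


scores = [[0.1, 0.9, 0.4], [0.5, 0.2, 0.0]]
print(ragged_transform_ref(scores, offsets=[0, 3], lengths=[3, 2], k=2))
```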

Utility Functions

topk.can_implement_filtered_topk()

Check whether the GPU has enough shared memory for the FilteredTopK algorithm.