FlashInfer 0.2.5 documentation

Get Started

  • Installation

Tutorials

  • Attention States and Recursive Attention
  • KV-Cache Layout in FlashInfer

PyTorch API Reference

  • flashinfer.decode
    • flashinfer.decode.single_decode_with_kv_cache
  • flashinfer.prefill
    • flashinfer.prefill.single_prefill_with_kv_cache
    • flashinfer.prefill.single_prefill_with_kv_cache_return_lse
  • flashinfer.cascade
    • flashinfer.cascade.merge_state
    • flashinfer.cascade.merge_state_in_place
    • flashinfer.cascade.merge_states
  • flashinfer.mla
  • flashinfer.sparse
  • flashinfer.page
    • flashinfer.page.append_paged_kv_cache
    • flashinfer.page.append_paged_mla_kv_cache
    • flashinfer.page.get_batch_indices_positions
  • flashinfer.sampling
    • flashinfer.sampling.sampling_from_probs
    • flashinfer.sampling.top_p_sampling_from_probs
    • flashinfer.sampling.top_k_sampling_from_probs
    • flashinfer.sampling.min_p_sampling_from_probs
    • flashinfer.sampling.top_k_top_p_sampling_from_logits
    • flashinfer.sampling.top_k_top_p_sampling_from_probs
    • flashinfer.sampling.top_p_renorm_probs
    • flashinfer.sampling.top_k_renorm_probs
    • flashinfer.sampling.top_k_mask_logits
    • flashinfer.sampling.chain_speculative_sampling
  • flashinfer.gemm
    • flashinfer.gemm.gemm_fp8_nt_groupwise
    • flashinfer.gemm.group_gemm_fp8_nt_groupwise
    • flashinfer.gemm.bmm_fp8
  • flashinfer.norm
    • flashinfer.norm.rmsnorm
    • flashinfer.norm.fused_add_rmsnorm
    • flashinfer.norm.gemma_rmsnorm
    • flashinfer.norm.gemma_fused_add_rmsnorm
  • flashinfer.rope
    • flashinfer.rope.apply_rope_inplace
    • flashinfer.rope.apply_llama31_rope_inplace
    • flashinfer.rope.apply_rope
    • flashinfer.rope.apply_llama31_rope
    • flashinfer.rope.apply_rope_pos_ids
    • flashinfer.rope.apply_rope_pos_ids_inplace
    • flashinfer.rope.apply_llama31_rope_pos_ids
    • flashinfer.rope.apply_llama31_rope_pos_ids_inplace
    • flashinfer.rope.apply_rope_with_cos_sin_cache
    • flashinfer.rope.apply_rope_with_cos_sin_cache_inplace
  • flashinfer.activation
    • flashinfer.activation.silu_and_mul
    • flashinfer.activation.gelu_tanh_and_mul
    • flashinfer.activation.gelu_and_mul
  • flashinfer.quantization
    • flashinfer.quantization.packbits
    • flashinfer.quantization.segment_packbits
Copyright © 2023-2024, FlashInfer Contributors