flashinfer.rope
Kernels for applying rotary embeddings.
|
Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace. |
|
Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace. |
|
Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor). |
|
Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor). |
|
Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor). |
|
Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace. |
|
Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor). |
|
Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace. |
|
Apply rotary embedding to keys and queries with precomputed cos/sin values. |
|
Apply rotary embedding to keys and queries with precomputed cos/sin values. |
|
Apply RoPE (Rotary Positional Embeddings) and quantize to FP8 format. |
|
Apply RoPE (Rotary Positional Embeddings), quantize to FP8, and append K/V to paged cache. |
|
Apply RoPE and quantize to FP8 for MLA attention. |
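
To make the ragged-batch summaries above concrete, here is a minimal pure-Python reference sketch of rotary embedding on a ragged layout. It is not the library's kernel and does not call flashinfer. The function name `rope_ragged_reference`, the parameter names `indptr`, `offsets`, and `rope_theta`, the single-head row layout, and the non-interleaved pairing (element `j` rotates with element `j + head_dim // 2`) are all assumptions made for illustration.

```python
import math


def rope_ragged_reference(x, indptr, offsets, rope_theta=1e4):
    """Hypothetical reference: rotate each row of a ragged batch by its position.

    x       : list of rows, each a list of floats of even length head_dim
              (a single head, rows of all requests packed back to back).
    indptr  : length batch + 1; rows indptr[i]:indptr[i + 1] belong to request i.
    offsets : length batch; absolute position of the first row of request i.
    """
    head_dim = len(x[0])
    half = head_dim // 2
    out = [row[:] for row in x]  # out-of-place: the input is left untouched
    for i in range(len(indptr) - 1):
        for r in range(indptr[i], indptr[i + 1]):
            pos = offsets[i] + (r - indptr[i])
            for j in range(half):
                # Frequency for pair j: rope_theta ** (-2j / head_dim)
                angle = pos * rope_theta ** (-2.0 * j / head_dim)
                c, s = math.cos(angle), math.sin(angle)
                a, b = x[r][j], x[r][j + half]  # assumed non-interleaved pairing
                out[r][j] = a * c - b * s
                out[r][j + half] = a * s + b * c
    return out


# Two requests packed into one ragged batch: rows 0-1 and row 2.
q = [[1.0, 2.0, 3.0, 4.0],
     [1.0, 2.0, 3.0, 4.0],
     [1.0, 2.0, 3.0, 4.0]]
indptr = [0, 2, 3]   # request 0 owns rows [0, 2), request 1 owns rows [2, 3)
offsets = [0, 5]     # request 0 starts at position 0, request 1 at position 5
rotated = rope_ragged_reference(q, indptr, offsets)
```

Because each pair of elements undergoes a plane rotation, the row at position 0 is returned unchanged and every row keeps its Euclidean norm. An in-place variant would write into `x` directly instead of copying it first.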