flashinfer.rope

Kernels for applying rotary embeddings.

apply_rope_inplace(q, k, indptr, offsets[, ...])

Apply rotary embedding to a batch of queries/keys (stored as a RaggedTensor) in place.

apply_llama31_rope_inplace(q, k, indptr, offsets)

Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as a RaggedTensor) in place.

apply_rope(q, k, indptr, offsets[, ...])

Apply rotary embedding to a batch of queries/keys (stored as a RaggedTensor), leaving the inputs unmodified.
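
To make the q, k, indptr, offsets arguments concrete, here is a minimal pure-Python sketch of what a ragged rotary embedding computes. It is a reference illustration, not the library's kernel. It assumes that indptr has batch_size + 1 entries delimiting each request's tokens, that offsets holds one starting position per request, that the rotation pairs element i with element i + head_dim / 2 (the non-interleaved layout), and that the base is 1e4. The helper names ragged_positions, rope_rotate and apply_rope_reference are hypothetical.

```python
import math

def ragged_positions(indptr, offsets):
    """Position of every token in the flattened (ragged) batch.

    Assumption: request b owns rows indptr[b]:indptr[b + 1], and its
    first token sits at position offsets[b].
    """
    positions = []
    for b in range(len(indptr) - 1):
        length = indptr[b + 1] - indptr[b]
        positions.extend(offsets[b] + t for t in range(length))
    return positions

def rope_rotate(vec, pos, theta=1e4):
    """Rotate one head vector by the angles for position `pos`.

    Assumption: non-interleaved layout, pairing vec[i] with vec[i + d/2].
    """
    d = len(vec)
    half = d // 2
    out = list(vec)
    for i in range(half):
        angle = pos * theta ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + half]
        out[i] = x * c - y * s
        out[i + half] = x * s + y * c
    return out

def apply_rope_reference(x, indptr, offsets, theta=1e4):
    """x[token][head] is a list of floats; returns a rotated copy."""
    positions = ragged_positions(indptr, offsets)
    return [[rope_rotate(head, pos, theta) for head in token]
            for token, pos in zip(x, positions)]
```

For example, under these assumptions a batch of two requests with lengths 2 and 3, where the second request already has 10 cached tokens, would use indptr = [0, 2, 5] and offsets = [0, 10]. The same routine would be applied to both the query and the key tensors.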

apply_llama31_rope(q, k, indptr, offsets[, ...])

Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as a RaggedTensor), leaving the inputs unmodified.
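
The Llama 3.1 variants differ from plain rotary embedding in how the per-dimension rotation frequencies are chosen. The sketch below shows the frequency rescaling as it is commonly described for Llama 3.1: high-frequency components are kept, low-frequency components are divided by a scale factor, and the band in between is interpolated. The function name llama31_inv_freq is hypothetical, and the default values (base 5e5, scale 8, low/high frequency factors 1 and 4, original context length 8192) are assumptions for illustration rather than values confirmed by this page.

```python
import math

def llama31_inv_freq(head_dim, rope_theta=5e5, rope_scale=8.0,
                     low_freq_factor=1.0, high_freq_factor=4.0,
                     old_context_len=8192):
    """Per-pair rotation frequencies with Llama 3.1 style rescaling.

    All default values are illustrative assumptions.
    """
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    freqs = []
    for i in range(head_dim // 2):
        freq = rope_theta ** (-2.0 * i / head_dim)
        wavelen = 2.0 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # Short wavelengths (high frequency): keep unchanged.
            freqs.append(freq)
        elif wavelen > low_freq_wavelen:
            # Long wavelengths (low frequency): slow down by rope_scale.
            freqs.append(freq / rope_scale)
        else:
            # Middle band: blend linearly between the two regimes.
            smooth = ((old_context_len / wavelen - low_freq_factor)
                      / (high_freq_factor - low_freq_factor))
            freqs.append((1.0 - smooth) * freq / rope_scale + smooth * freq)
    return freqs
```

The resulting frequencies would replace the plain theta ** (-2 * i / head_dim) terms when computing the rotation angle for each token position; the rotation itself is unchanged.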