flashinfer.rope

Kernels for applying rotary embeddings.

`apply_rope_inplace`(q, k, indptr, offsets[, ...])	Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace.
`apply_llama31_rope_inplace`(q, k, indptr, offsets)	Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace.
`apply_rope`(q, k, indptr, offsets[, ...])	Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor).
`apply_llama31_rope`(q, k, indptr, offsets[, ...])	Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor).
`apply_rope_pos_ids`(q, k, pos_ids[, ...])	Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor).
`apply_rope_pos_ids_inplace`(q, k, pos_ids[, ...])	Apply rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace.
`apply_llama31_rope_pos_ids`(q, k, pos_ids[, ...])	Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor).
`apply_llama31_rope_pos_ids_inplace`(q, k, pos_ids)	Apply Llama 3.1 style rotary embedding to a batch of queries/keys (stored as RaggedTensor) inplace.
`apply_rope_with_cos_sin_cache`(q, k, ...[, ...])	Apply rotary embedding to keys and queries with precomputed cos/sin values.
`apply_rope_with_cos_sin_cache_inplace`(q, k, ...)	Apply rotary embedding to keys and queries with precomputed cos/sin values inplace.
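
The `indptr`/`offsets` variants and the `pos_ids` variants above describe the same per-row positions in two different ways. The following is a minimal pure-Python sketch of that relationship and of the rotation itself, not the library's kernel. It assumes that `indptr[i]:indptr[i+1]` delimits the rows of sequence `i`, that `offsets[i]` is the position of its first row, that dimension pairs are non-interleaved (`i` with `i + d/2`), and that the frequency base is `1e4`. The helper names `positions_from_indptr`, `rope_rotate`, and `apply_rope_reference` are hypothetical and exist only for illustration; each row is a single head vector to keep the example short.

```python
import math


def positions_from_indptr(indptr, offsets):
    """Expand a ragged (indptr, offsets) description into one position per row.

    Assumption: row j of sequence i sits at position offsets[i] + j.
    """
    pos_ids = []
    for i in range(len(indptr) - 1):
        seq_len = indptr[i + 1] - indptr[i]
        pos_ids.extend(offsets[i] + j for j in range(seq_len))
    return pos_ids


def rope_rotate(vec, pos, rope_theta=1e4):
    """Rotate one head vector by position `pos` (non-interleaved pairing)."""
    dim = len(vec)
    half = dim // 2
    out = list(vec)
    for i in range(half):
        # Frequency of the i-th pair decays geometrically with i.
        angle = pos * rope_theta ** (-2.0 * i / dim)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + half]
        out[i] = x * c - y * s
        out[i + half] = x * s + y * c
    return out


def apply_rope_reference(rows, indptr, offsets, rope_theta=1e4):
    """Return rotated copies of `rows`; the input list is left untouched."""
    pos_ids = positions_from_indptr(indptr, offsets)
    return [rope_rotate(v, p, rope_theta) for v, p in zip(rows, pos_ids)]


# Two sequences of lengths 2 and 3; the first one starts at position 10.
print(positions_from_indptr([0, 2, 5], [10, 0]))  # [10, 11, 0, 1, 2]
```

Under these assumptions, a row at position 0 is returned unchanged, every rotation preserves the vector norm, and the dot product of a rotated query and a rotated key depends only on the difference between their positions, which is the property attention relies on.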