.. _apicudnn:

flashinfer.cudnn
================

cuDNN-backed attention kernels. These wrappers call into NVIDIA's cuDNN runtime
for batch prefill and batch decode, and are typically used as an alternative
backend for ``BatchPrefillWithPagedKVCacheWrapper`` /
``BatchDecodeWithPagedKVCacheWrapper`` when cuDNN is available on the host GPU.

.. currentmodule:: flashinfer.cudnn

.. autosummary::
    :toctree: ../generated

    cudnn_batch_decode_with_kv_cache
    cudnn_batch_prefill_with_kv_cache