flashinfer.gdn_decode
Gated Delta-Rule decode-side kernels used in Mamba-2 / GDN-style sequence models. These functions consume a pre-built KV / state cache and run the recurrent gated delta-rule update for the current decode step.
Gated Delta Rule Decode kernel (K-major layout, no transpose needed).
Gated Delta Rule Decode kernel for single-token generation.
Gated Delta Rule MTP kernel (Multiple Token Processing).
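To make the recurrent update concrete, here is a minimal NumPy reference sketch of one gated delta-rule step for a single head, plus a loop that applies it to several tokens in sequence (roughly what a multiple-token kernel has to compute). It is an illustration only and does not call flashinfer. The function names `gated_delta_step` and `gated_delta_multi_token`, the `(d_v, d_k)` state layout, and the scalar gates `alpha` (decay) and `beta` (write strength) are assumptions made for this example; the actual kernels may use different layouts, dtypes, and gate parameterizations.

```python
import numpy as np


def gated_delta_step(state, q, k, v, alpha, beta):
    """One gated delta-rule step for a single head (reference sketch).

    Assumed shapes (hypothetical, chosen for this example):
      state: (d_v, d_k) recurrent memory, q and k: (d_k,), v: (d_v,)
      alpha, beta: scalars in [0, 1]
    Implements S_t = alpha * S_{t-1} (I - beta k k^T) + beta v k^T, o_t = S_t q.
    """
    decayed = alpha * state                  # gate: decay the old memory
    pred = decayed @ k                       # value the memory currently stores for k
    new_state = decayed + beta * np.outer(v - pred, k)  # delta-rule correction
    out = new_state @ q                      # read out with the query
    return new_state, out


def gated_delta_multi_token(state, qs, ks, vs, alphas, betas):
    """Apply the step to T tokens in order; returns final state and (T, d_v) outputs."""
    outs = []
    for q, k, v, a, b in zip(qs, ks, vs, alphas, betas):
        state, o = gated_delta_step(state, q, k, v, a, b)
        outs.append(o)
    return state, np.stack(outs)


if __name__ == "__main__":
    d_k, d_v = 4, 3
    rng = np.random.default_rng(0)
    k = rng.standard_normal(d_k)
    k /= np.linalg.norm(k)                   # unit-norm key
    v = rng.standard_normal(d_v)
    state = np.zeros((d_v, d_k))
    # With alpha = beta = 1 and a unit-norm key, reading back with q = k recovers v.
    state, out = gated_delta_step(state, k, k, v, alpha=1.0, beta=1.0)
    print(np.allclose(out, v))
```

The state is updated in place of a growing key/value history, which is why a decode step only needs the cached state and the current token's `q`, `k`, `v`, and gates.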