flashinfer.fp4_quantization
Note
Starting in FlashInfer 0.6.12, the canonical home for FP4 quantization APIs is flashinfer.quantization.
flashinfer.fp4_quantization remains as a backwards-compatibility
shim that re-exports the same symbols, so existing code such as
from flashinfer.fp4_quantization import fp4_quantize keeps
working. New code should import from
flashinfer.quantization.fp4_quantization (or its canonical
re-export at flashinfer.quantization).
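For code that has to run against FlashInfer releases on both sides of this change, one option is to try the canonical module first and fall back to the shim. The following is a minimal sketch, not part of FlashInfer: the helper name `load_fp4_symbol` is hypothetical, and the only names taken from this page are the two module paths and `fp4_quantize`. It returns `None` when neither module can be imported, so it can be run without FlashInfer installed.

```python
import importlib


def load_fp4_symbol(name="fp4_quantize"):
    """Hypothetical helper: look up `name` in the canonical module first,
    then in the legacy shim. Returns None if neither module is importable
    or neither exposes the symbol."""
    candidates = (
        "flashinfer.quantization",      # canonical home named on this page
        "flashinfer.fp4_quantization",  # backwards-compatibility shim
    )
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module (or FlashInfer itself) is not available; try the next one.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    return None


fp4_quantize = load_fp4_symbol("fp4_quantize")
```

Projects that only target releases where `flashinfer.quantization` exists can skip the helper and import from that module directly.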
This page intentionally does not re-document the FP4 symbols, because
each symbol is the same Python object as the one rendered on
flashinfer.quantization — duplicating the autosummary entries here would
make Sphinx emit “duplicate object description” warnings under
sphinx -W.
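The "same Python object" point can be illustrated with a small, self-contained sketch. It does not import FlashInfer; the `demo_pkg` module names and the placeholder function body are assumptions made for illustration, built with the standard library's `types.ModuleType`.

```python
import types

# Hypothetical stand-in for the canonical module (not FlashInfer itself).
canonical = types.ModuleType("demo_pkg.quantization")


def fp4_quantize(x):
    """Placeholder body standing in for a real quantization function."""
    return x


canonical.fp4_quantize = fp4_quantize

# Hypothetical stand-in for the legacy shim: it re-binds the same object
# under a second module path rather than defining a new function.
shim = types.ModuleType("demo_pkg.fp4_quantization")
shim.fp4_quantize = canonical.fp4_quantize

# Both names refer to one function object, so there is only one thing
# to document.
same_object = shim.fp4_quantize is canonical.fp4_quantize
print(same_object)  # True
```

Because a re-export is a second name for one object, documenting it under both module paths would describe the same object twice, which is the situation the duplicate-description warning reports.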
See Also
flashinfer.quantization — canonical FP4 / FP8 / packbits API reference, including all of the following symbols that
flashinfer.fp4_quantization used to host:
flashinfer.quantization.block_scale_interleave() (alias: nvfp4_block_scale_interleave)