flashinfer.fp4_quantization.nvfp4_block_scale_interleave

flashinfer.fp4_quantization.nvfp4_block_scale_interleave(unswizzled_sf: Tensor) → Tensor

Swizzle block scale tensor for FP4 format.

This function swizzles the block scale tensor to optimize memory access patterns for FP4 operations. The swizzled output must be padded along the m dimension to a multiple of 128.

Parameters: unswizzled_sf (torch.Tensor) – Input tensor with dtype uint8 or bfloat16.
Returns: Swizzled tensor with the same shape as input.
Return type: torch.Tensor
Raises: AssertionError – If input dtype is not uint8 or bfloat16.
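
The padding requirement and the dtype constraint can be illustrated with a minimal sketch. The helper names `padded_rows` and `swizzle_block_scales` are hypothetical and not part of FlashInfer. The wrapper assumes that PyTorch and flashinfer are installed and that a supported GPU is available. Whether the padding is applied by the caller or by the library is not stated above, so the first helper only computes the padded row count.

```python
def padded_rows(m: int, multiple: int = 128) -> int:
    """Round m up to the next multiple of `multiple` (128 per the note above)."""
    return ((m + multiple - 1) // multiple) * multiple


def swizzle_block_scales(unswizzled_sf):
    """Hypothetical wrapper: check the dtype, then call the documented function."""
    # Imports are deferred so that padded_rows works without torch or flashinfer.
    import torch
    from flashinfer.fp4_quantization import nvfp4_block_scale_interleave

    # Mirror the documented dtype requirement with an explicit error message.
    if unswizzled_sf.dtype not in (torch.uint8, torch.bfloat16):
        raise TypeError(
            f"expected uint8 or bfloat16 scale factors, got {unswizzled_sf.dtype}"
        )
    return nvfp4_block_scale_interleave(unswizzled_sf)
```

For example, `padded_rows(300)` gives the row count that a 300-row scale tensor would occupy after padding to a multiple of 128.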