flashinfer.quantization.e2m1_and_ufp8sf_scale_to_float

flashinfer.quantization.e2m1_and_ufp8sf_scale_to_float(e2m1_tensor: Tensor, ufp8_scale_tensor: Tensor, global_scale_tensor: Tensor | None = None, sf_vec_size: int = 16, ufp8_type: int = 1, is_sf_swizzled_layout: bool = True) → Tensor

Dequantize an E2M1 tensor with UFP8 scales back to float32.

Converts a packed FP4 tensor in E2M1 format, which stores two 4-bit values per uint8 byte, back to float32 values using the associated UFP8 scale factors and, when provided, the global scale.
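To make the packed E2M1 format concrete, here is a minimal pure-Python sketch of how one byte could be unpacked into two float values. It is an illustration, not the library's kernel: the helper names (`decode_e2m1`, `unpack_e2m1_byte`, `unpack_e2m1_row`) are hypothetical, and the nibble order (low nibble first) is an assumption that should be checked against the quantization routine that produced the tensor.

```python
# Magnitudes representable by E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit),
# indexed by the low three bits of the 4-bit code.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(code):
    """Decode one 4-bit E2M1 code (0..15) into a Python float."""
    magnitude = E2M1_MAGNITUDES[code & 0x7]
    # Bit 3 is the sign bit.
    return -magnitude if code & 0x8 else magnitude


def unpack_e2m1_byte(byte):
    """Split one uint8 into two decoded E2M1 values.

    ASSUMPTION: the low nibble holds the even-indexed element and the
    high nibble holds the odd-indexed element.
    """
    return decode_e2m1(byte & 0xF), decode_e2m1(byte >> 4)


def unpack_e2m1_row(packed_row):
    """Decode a row of K/2 packed bytes into K unscaled float values."""
    values = []
    for byte in packed_row:
        values.extend(unpack_e2m1_byte(byte))
    return values
```

Under these assumptions a row of K/2 bytes expands to K values, which matches the [M, K/2] input and [M, K] output shapes documented here. The scale factors still have to be applied to these raw values.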

Parameters:

e2m1_tensor (torch.Tensor) – Packed FP4 tensor in E2M1 format of shape [M, K/2] with dtype uint8.
ufp8_scale_tensor (torch.Tensor) – Scale-factor tensor in UFP8 format with dtype uint8.
global_scale_tensor (torch.Tensor, optional) – Global scale factor of shape [1] and dtype float32. Defaults to None.
sf_vec_size (int) – Scale-factor vector size, i.e. the number of consecutive elements that share one scale factor. Defaults to 16.
ufp8_type (int) – UFP8 scale-factor type (0 for UE8M0, 1 for E4M3). Defaults to 1.
is_sf_swizzled_layout (bool) – Whether the scale factors are stored in the swizzled layout. Defaults to True.
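The interaction of ufp8_type, sf_vec_size, and the global scale can be sketched in pure Python as follows. This is a reference-style illustration under stated assumptions, not the library's implementation: the helper names are hypothetical, the scale bytes are assumed to be in a plain linear (non-swizzled) order with one byte per sf_vec_size elements, the swizzled layout is not modeled, and the global scale is assumed to be applied as a multiplier.

```python
import math


def decode_ue8m0(byte):
    """UE8M0: 8 exponent bits, no mantissa, bias 127 (a pure power of two)."""
    return math.ldexp(1.0, byte - 127)


def decode_e4m3(byte):
    """E4M3: 4 exponent bits (bias 7) and 3 mantissa bits.

    The sign bit is ignored because scale factors are treated as
    non-negative here, and NaN encodings are not handled in this sketch.
    """
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0:
        # Subnormal: no implicit leading one.
        return math.ldexp(mantissa / 8.0, -6)
    return math.ldexp(1.0 + mantissa / 8.0, exponent - 7)


def apply_scales(values, scale_bytes, global_scale=1.0,
                 sf_vec_size=16, ufp8_type=1):
    """Scale one row of already-decoded E2M1 values block by block.

    ASSUMPTIONS: linear scale layout (one byte per sf_vec_size elements)
    and a global scale that multiplies the result.
    """
    decode = decode_e4m3 if ufp8_type == 1 else decode_ue8m0
    return [
        value * decode(scale_bytes[index // sf_vec_size]) * global_scale
        for index, value in enumerate(values)
    ]
```

For example, with sf_vec_size=2 the first two values of a row would use the first scale byte and the next two values the second one. Whether the global scale should be passed as the quantization scale or its reciprocal depends on how the tensor was quantized, so it is worth verifying on a round trip.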

Returns:

Dequantized float tensor of shape [M, K] with dtype float32.

Return type:

torch.Tensor