flashinfer.fused_moe.reorder_rows_for_gated_act_gemm

flashinfer.fused_moe.reorder_rows_for_gated_act_gemm(x: Tensor) → Tensor

Reorder rows of a weight tensor for the TensorRT-LLM gated-activation GEMM layout.

Pure-PyTorch reimplementation of the TensorRT-LLM reorderRowsForGatedActGemm helper. Use it to pre-permute the rows of the up/gate weight matrix so that the fused gated-activation kernels can read the two halves with a single contiguous load.

Parameters:

x (torch.Tensor) – Weight tensor whose rows will be permuted. Any dtype is accepted; only the row dimension is reordered.

Returns:

Row-permuted copy of x. The result is a new contiguous tensor that does not share storage with x, because PyTorch advanced indexing always copies and never aliases.

Return type:

torch.Tensor
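
Example:

The following is a minimal sketch of the kind of row permutation described above, not the library's implementation. It assumes that the reordering interleaves the first and second halves of the rows (row 0, row N/2, row 1, row N/2 + 1, ...) and that the row count is even; both are assumptions of this sketch and are not stated on this page. The helper name interleave_half_row_indices is hypothetical and is not part of FlashInfer.

```python
from typing import List


def interleave_half_row_indices(num_rows: int) -> List[int]:
    """Hypothetical helper (not a FlashInfer API).

    Returns row indices that alternate between the first half and the
    second half of a matrix with `num_rows` rows. The interleaved order
    is an assumption made for this illustration.
    """
    if num_rows % 2 != 0:
        # Assumption of this sketch: the two halves must be equal in size.
        raise ValueError(f"num_rows must be even, got {num_rows}")

    half = num_rows // 2
    indices: List[int] = []
    for i in range(half):
        indices.append(i)         # i-th row of the first half
        indices.append(half + i)  # i-th row of the second half
    return indices


# Six rows: "a*" labels the first half, "b*" labels the second half.
rows = ["a0", "a1", "a2", "b0", "b1", "b2"]
perm = interleave_half_row_indices(len(rows))
reordered = [rows[i] for i in perm]

print(perm)       # [0, 3, 1, 4, 2, 5]
print(reordered)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

With a 2-D torch.Tensor x, the same index list could be applied as x[torch.tensor(perm)]. Indexing with an index tensor is advanced indexing, so the result is a new tensor rather than a view of x.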