flashinfer.comm.vllm_init_custom_ar

flashinfer.comm.vllm_init_custom_ar(ipc_tensors: List[int], rank_data: Tensor, rank: int, full_nvlink: bool) → int

Initialize the vLLM custom all-reduce backend.

Parameters:

ipc_tensors (list[int]) – IPC pointers to the per-rank communication buffers.
rank_data (torch.Tensor) – Scratch tensor (one per rank) used for metadata exchange.
rank (int) – Current rank within the all-reduce world.
full_nvlink (bool) – True when every pair of ranks is connected via NVLink (enables the fully-NVLink-optimized kernel path).

Returns:

Opaque handle (fa) to be passed to subsequent vllm_ar calls.

Return type:

int
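
Example:

The following is a minimal sketch, not part of the library, of how a caller might sanity-check arguments before initialization. The wrapper name `checked_init_custom_ar` and its `init_fn` parameter are hypothetical, and the checks assume that `ipc_tensors` holds exactly one pointer per rank, which the description above suggests but does not state outright. The initializer is passed in as a callable so the sketch can run without a GPU.

```python
from typing import Callable, List


def checked_init_custom_ar(
    init_fn: Callable[..., int],
    ipc_tensors: List[int],
    rank_data: object,
    rank: int,
    full_nvlink: bool,
) -> int:
    """Hypothetical helper: validate arguments, then forward them to init_fn."""
    # Assumption: one IPC pointer per rank, so the list length is the world size.
    world_size = len(ipc_tensors)
    if world_size < 2:
        raise ValueError("expected IPC pointers for at least two ranks")
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    if not all(isinstance(p, int) for p in ipc_tensors):
        raise TypeError("ipc_tensors must contain integer pointers")
    # Forward the arguments unchanged, in the documented order,
    # and return the opaque handle produced by init_fn.
    return init_fn(ipc_tensors, rank_data, rank, full_nvlink)
```

On a machine with FlashInfer installed, `init_fn` would be `flashinfer.comm.vllm_init_custom_ar`, and the returned integer is the handle to keep for later vllm_ar calls.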