flashinfer.comm.vllm_init_custom_ar
- flashinfer.comm.vllm_init_custom_ar(ipc_tensors: List[int], rank_data: Tensor, rank: int, full_nvlink: bool) → int
Initialize the vLLM custom all-reduce backend.
- Parameters:
ipc_tensors (list[int]) – IPC pointers to the per-rank communication buffers.
rank_data (torch.Tensor) – Scratch tensor (one per rank) used for metadata exchange.
rank (int) – Current rank within the all-reduce world.
full_nvlink (bool) – True when every pair of ranks is connected via NVLink (enables the fully-NVLink-optimized kernel path).
- Returns:
Opaque handle (fa) to be passed to subsequent vllm_ar calls.
- Return type:
int
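The following is a minimal sketch of how the arguments might be assembled before the call. The helper names `all_pairs_nvlink` and `init_custom_ar`, and the `connected` matrix, are hypothetical and not part of FlashInfer. The sketch also assumes that `ipc_tensors` holds one IPC pointer per rank, which the description above suggests but does not state. Only the final call uses the documented signature.

```python
from typing import List, Sequence


def all_pairs_nvlink(connected: Sequence[Sequence[bool]]) -> bool:
    """Return True only if every pair of distinct ranks is NVLink-connected.

    `connected[i][j]` is a hypothetical, caller-supplied flag saying whether
    rank i and rank j share an NVLink connection. Diagonal entries are ignored.
    """
    n = len(connected)
    return all(connected[i][j] for i in range(n) for j in range(n) if i != j)


def init_custom_ar(
    ipc_tensors: List[int],
    rank_data,
    rank: int,
    connected: Sequence[Sequence[bool]],
) -> int:
    """Hypothetical wrapper: validate the arguments, then call the documented API."""
    # Assumption: ipc_tensors holds one IPC pointer per rank.
    world_size = len(ipc_tensors)
    if len(connected) != world_size:
        raise ValueError("connectivity matrix must have one row per rank")
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")

    # Imported lazily so the helpers above can be used without FlashInfer installed.
    import flashinfer.comm as comm

    # The returned integer is the opaque handle (fa) described under Returns.
    fa = comm.vllm_init_custom_ar(
        ipc_tensors, rank_data, rank, all_pairs_nvlink(connected)
    )
    return fa
```

Deriving `full_nvlink` from a connectivity matrix keeps the flag consistent with its definition: a single missing link between any two ranks yields `False`.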