flashinfer.logits_processor.Softmax

class flashinfer.logits_processor.Softmax(enable_pdl: bool | None = None, **params: Any)

Softmax processor to convert logits to probabilities.

Applies the softmax function along the last dimension, so each row of the output is non-negative and sums to 1.

TensorType.LOGITS -> TensorType.PROBS

Parameters:

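enable_pdl (bool, optional, Compile-time) – Whether to enable PDL (Programmatic Dependent Launch) for the kernel implementation. Default is None, which means PDL is enabled automatically if the device supports it.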

Examples

>>> import torch
>>> from flashinfer.logits_processor import LogitsPipe, Softmax
>>> torch.manual_seed(42)
>>> pipe = LogitsPipe([Softmax()])
>>> logits = torch.randn(2, 2, device="cuda")
>>> logits
tensor([[ 0.1940,  2.1614],
        [-0.1721,  0.8491]], device='cuda:0')
>>> probs = pipe(logits)
>>> probs
tensor([[0.1227, 0.8773],
        [0.2648, 0.7352]], device='cuda:0')
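
As a rough cross-check of the probabilities above, the following is a minimal pure-Python sketch of a row-wise, numerically stable softmax. It is a reference illustration only and not FlashInfer's kernel; the helper name `softmax_row` is hypothetical, and the input values are the rounded logits printed in the example, so results should agree only to about four decimal places.

```python
import math


def softmax_row(row):
    """Numerically stable softmax over one row of logits (hypothetical helper)."""
    m = max(row)                            # subtract the row max to avoid overflow
    exps = [math.exp(x - m) for x in row]   # exponentiate the shifted logits
    total = sum(exps)                       # normalization constant for the row
    return [e / total for e in exps]


# Rounded logits copied from the example above.
logits = [[0.1940, 2.1614], [-0.1721, 0.8491]]
probs = [softmax_row(row) for row in logits]

for row in probs:
    print([round(p, 4) for p in row])
```

Subtracting the row maximum before exponentiating leaves the result unchanged mathematically but keeps the intermediate values bounded for large logits.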

Notes

This processor can appear at most once in a pipeline.

__init__(enable_pdl: bool | None = None, **params: Any)

Constructor for Softmax processor.

Parameters:

enable_pdl (bool, optional, Compile-time) – Whether to enable PDL (Programmatic Dependent Launch) for the kernel implementation. Default is None, which means PDL is enabled automatically if the device supports it.

Methods

__init__([enable_pdl])

Constructor for Softmax processor.

legalize(input_type)

Legalize the processor into a list of low-level operators.