flashinfer.logits_processor.Softmax

class flashinfer.logits_processor.Softmax(enable_pdl: bool | None = None, **params: Any)

Softmax processor to convert logits to probabilities.

Applies the softmax function.

TensorType.LOGITS -> TensorType.PROBS

Parameters: enable_pdl (bool, optional, Compile-time) – Whether to enable PDL for the kernel implementation. Default is None, in which case PDL is enabled automatically if the device supports it.

Examples

>>> import torch
>>> from flashinfer.logits_processor import LogitsPipe, Softmax
>>> torch.manual_seed(42)
>>> pipe = LogitsPipe([Softmax()])
>>> logits = torch.randn(2, 2, device="cuda")
>>> logits
tensor([[ 0.1940,  2.1614],
        [-0.1721,  0.8491]], device='cuda:0')
>>> probs = pipe(logits)
>>> probs
tensor([[0.1227, 0.8773],
        [0.2648, 0.7352]], device='cuda:0')
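
The probabilities printed above can be sanity-checked without a GPU. The following is a minimal sketch, assuming softmax is taken independently across each row (the last dimension), as the example output suggests. `softmax_row` is an illustrative helper written for this check and is not part of flashinfer; the input values are the rounded logits shown above.

```python
import math


def softmax_row(row):
    """Numerically stable softmax over one row of logits (illustrative helper)."""
    # Subtracting the row maximum leaves the result unchanged
    # but keeps exp() from overflowing on large logits.
    row_max = max(row)
    exps = [math.exp(x - row_max) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied (rounded to four decimals) from the example above.
logits = [[0.1940, 2.1614], [-0.1721, 0.8491]]
probs = [softmax_row(row) for row in logits]

for row in probs:
    print([round(p, 4) for p in row])
```

Each row of the result sums to 1. Because the inputs here are the rounded logits, the values match the printed `probs` tensor only up to rounding.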

Notes

Can only appear once in a pipeline.

__init__(enable_pdl: bool | None = None, **params: Any)

Constructor for Softmax processor.

Parameters: enable_pdl (bool, optional, Compile-time) – Whether to enable PDL for the kernel implementation. Default is None, in which case PDL is enabled automatically if the device supports it.

Methods

`__init__`([enable_pdl])	Constructor for Softmax processor.
`legalize`(input_type)	Legalize the processor into a list of low-level operators.