
[MX] Support mixed MXFP4/FP6/FP8 linear layer #1666

Open
balancap opened this issue Feb 5, 2025 · 1 comment

balancap (Contributor) commented Feb 5, 2025

Blackwell hardware natively supports any combination of MXFP4/FP6/FP8 in matmuls. See the PTX and Cutlass documentation.

According to the MX paper, and more generally the quantization literature, there are advantages to using different bitwidths for weights, activations, and gradients. It would be very useful for mx_mm and MXLinear to support this more general setting.
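
For illustration, here is a minimal sketch (not torchao's actual mx_mm code) of where each element dtype would apply in a mixed-precision MX linear: activations and weights are quantized in the forward pass, incoming gradients in the backward pass. The fake_quantize_mx helper and the string dtype names are placeholders, not real APIs.

    import torch

    def fake_quantize_mx(x, elem_dtype, block_size):
        # Placeholder: real MX quantization would compute a shared power-of-two scale
        # per block of `block_size` elements and cast the elements to `elem_dtype`.
        return x

    class MixedMXLinearFn(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, w, in_dtype, w_dtype, grad_dtype, block_size):
            # For simplicity this sketch saves the high-precision tensors.
            ctx.save_for_backward(x, w)
            ctx.grad_dtype, ctx.block_size = grad_dtype, block_size
            x_mx = fake_quantize_mx(x, in_dtype, block_size)  # e.g. MXFP8 activations
            w_mx = fake_quantize_mx(w, w_dtype, block_size)   # e.g. MXFP4 weights
            return x_mx @ w_mx.t()

        @staticmethod
        def backward(ctx, grad_out):
            x, w = ctx.saved_tensors
            # e.g. MXFP6 gradients
            g_mx = fake_quantize_mx(grad_out, ctx.grad_dtype, ctx.block_size)
            return g_mx @ w, g_mx.t() @ x, None, None, None, None

    y = MixedMXLinearFn.apply(
        torch.randn(8, 64, requires_grad=True), torch.randn(128, 64),
        "fp8_e4m3", "fp4_e2m1", "fp6_e3m2", 32)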

balancap (Contributor, Author) commented Feb 5, 2025

#1667 adds mixed element dtype support to mx_mm. A similar, general interface could be added to MXLinear:

class MXLinear(torch.nn.Linear):
    @classmethod
    @torch.no_grad()
    def from_float(cls, mod, in_elem_dtype, w_elem_dtype, grad_elem_dtype, block_size):
        # Convert a float `mod`, with separate MX element dtypes for
        # activations (inputs), weights and gradients.
        ...
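
A hedged usage sketch of the proposed conversion follows; the import paths, the dtype constants (DTYPE_FP4, DTYPE_FP6_E3M2) and the exact keyword names are assumptions rather than the final API.

    import torch
    from torchao.prototype.mx_formats.mx_linear import MXLinear            # assumed path
    from torchao.prototype.mx_formats.constants import DTYPE_FP4, DTYPE_FP6_E3M2

    lin = torch.nn.Linear(1024, 4096, bias=False, dtype=torch.bfloat16)
    mx_lin = MXLinear.from_float(
        lin,
        in_elem_dtype=torch.float8_e4m3fn,  # MXFP8 activations
        w_elem_dtype=DTYPE_FP4,             # MXFP4 weights
        grad_elem_dtype=DTYPE_FP6_E3M2,     # MXFP6 gradients
        block_size=32,
    )
    out = mx_lin(torch.randn(16, 1024, dtype=torch.bfloat16))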
