According to the MX paper, and more generally the quantization literature, there are advantages to using different bitwidths for weights, activations, and gradients. It would be very useful for mx_mm and MXLinear to support this more general setting.
Blackwell hardware natively supports any combination of MXFP4/FP6/FP8 in matmuls. See the PTX and CUTLASS documentation.