Exploring FPGA designs for MX and beyond

Ebby Samson, Naveen Mellempudi, Wayne Luk, George A. Constantinides

TL;DR

The paper tackles the challenge of efficiently deploying neural networks with low-precision data representations on FPGAs by delivering the first open-source MX-compatible FPGA implementation, including IP cores, conversions, and a Brevitas-integrated exploration workflow. It presents open-source hardware components for the MX concrete formats, analyzes how block-structured scale sharing affects area and accuracy, and demonstrates practical viability through ResNet-18 on ImageNet with both PTQ and QAT. Key findings show that narrow MX formats (e.g., MXINT4/5, MXINT6/7) offer strong FPGA area–accuracy trade-offs, that QAT further improves low-bit formats, and that MX formats outperform traditional per-tensor and per-channel quantization in this hardware context. The work provides an open-source toolkit for researchers and practitioners to design, evaluate, and iterate on MX-based accelerators on FPGAs, enabling broader exploration of mixed-precision and alternative scale computation strategies.

Abstract

A number of companies recently worked together to release the new Open Compute Project MX standard for low-precision computation, aimed at efficient neural network implementation. In this paper, we describe and evaluate the first open-source FPGA implementation of the arithmetic defined in the standard. Our designs fully support all the standard's concrete formats for conversion into and out of MX formats and for the standard-defined arithmetic operations, as well as arbitrary fixed-point and floating-point formats. Certain elements of the standard are left as implementation-defined, and we present the first concrete FPGA-inspired choices for these elements, which we outline in the paper. Our library of optimized hardware components is available open source, and can be used to build larger systems. For this purpose, we also describe and release an open-source PyTorch library for quantization into the new standard, integrated with the Brevitas library so that the community can develop novel neural network designs quantized with MX formats in mind. We demonstrate the usability and efficacy of our libraries via the implementation of example neural networks such as ResNet-18 on the ImageNet ILSVRC12 dataset. Our testing shows that MX is very effective for formats such as INT5 or FP6, which are not natively supported on GPUs. This gives FPGAs an advantage, as they have the flexibility to implement a custom datapath and take advantage of the smaller area footprints offered by these formats.

Paper Structure (17 sections, 7 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Our implementation of the Dot standard-defined operation. Grey symbols are used for formats with special encodings. Table \ref{tab:dp_widths} shows the widths of signals. Multiplier inputs can be FP or INT, but outputs are always INT.
  • Figure 2: An adder that normalises operands, similar to a floating-point adder.
  • Figure 3: Features of our quantizer. Each block can be customised or replaced to implement other quantization schemes. In the $2^{emax_{elem}}$ term, $emax_{elem}$ is the largest exponent possible in the element format, and $A'$ represents the real-valued tensor formed by applying the scale to $A_q$.
  • Figure 4: Error vs. estimated area of quantization schemes. Marker shape shows the scale-sharing regime. The grey dotted line is the FP32 baseline; the other dotted lines show Pareto fronts. Pareto-optimal points are labelled with format and block size. Only schemes that achieved more than 60% accuracy are shown.
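
The captions for Figures 1 and 3 describe a quantizer that derives a shared power-of-two scale from $2^{emax_{elem}}$ and a dot product whose multiplier outputs are always integers. The following is a minimal NumPy sketch of that idea for integer elements only. It is not the authors' hardware or their Brevitas integration. The function names `mx_int_quantize` and `mx_dot`, the default block size of 32, round-to-nearest with saturation, and the treatment of an `elem_bits`-wide element as a plain signed integer (so that `emax_elem = elem_bits - 2`) are all assumptions made for illustration.

```python
import numpy as np


def mx_int_quantize(a, block_size=32, elem_bits=8):
    """Hypothetical block quantizer with one power-of-two scale per block.

    Returns (a_q, shared_exp, a_prime): integer elements, the per-block
    shared exponent, and the dequantized tensor A' = A_q * 2**shared_exp.
    """
    a = np.asarray(a, dtype=np.float64)
    if a.ndim != 1 or a.size % block_size != 0:
        raise ValueError("expected a 1-D array whose length is a multiple of block_size")
    blocks = a.reshape(-1, block_size)

    # Assumption: elements are plain signed integers, so the largest
    # power of two an element can hold is 2**(elem_bits - 2).
    emax_elem = elem_bits - 2
    qmax = 2 ** (elem_bits - 1) - 1

    # Largest magnitude in each block; all-zero blocks fall back to 1.0
    # so that the exponent computation stays well defined.
    max_abs = np.max(np.abs(blocks), axis=1, keepdims=True)
    safe = np.where(max_abs > 0, max_abs, 1.0)

    # frexp gives x = m * 2**e with 0.5 <= m < 1, hence floor(log2(x)) = e - 1.
    _, e = np.frexp(safe)
    shared_exp = (e - 1) - emax_elem
    scale = np.ldexp(1.0, shared_exp)

    # Round to nearest and saturate to the symmetric integer range.
    a_q = np.clip(np.rint(blocks / scale), -qmax, qmax).astype(np.int64)
    a_prime = (a_q * scale).reshape(a.shape)
    return a_q, shared_exp, a_prime


def mx_dot(a_q, a_exp, b_q, b_exp):
    """Dot product of two quantized vectors that share the same blocking.

    Products and the per-block accumulation stay in integers; each block
    sum is then scaled once by 2**(a_exp + b_exp).
    """
    acc = np.sum(a_q * b_q, axis=1)                # integer accumulate per block
    exp = (a_exp + b_exp).ravel()                  # combined shared exponents
    return float(np.sum(np.ldexp(acc.astype(np.float64), exp)))
```

Calling `mx_int_quantize(x, block_size=32, elem_bits=5)` on a 1-D array `x` would give a rough software model of an INT5-style block format under these assumptions. Narrower elements shrink the integer multipliers in `mx_dot`, which is the kind of area saving the abstract attributes to custom FPGA datapaths.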