Hardware for converting floating-point to the microscaling (MX) format

Danila Gorodecky, Leonel Sousa

TL;DR

An algorithm and a memory-free hardware model for converting 32 single-precision floating-point numbers to the MX-format, a reduced representation of floating-point numbers, are presented, together with experimental results.

Abstract

This paper proposes hardware converters for the microscaling format (MX-format), a reduced representation of floating-point numbers. We present an algorithm and a memory-free hardware model for converting 32 single-precision floating-point numbers to MX-format. The proposed model supports six different types of MX-format: E5M2, E4M3, E3M2, E2M3, E2M1, and INT8. The conversion process consists of three steps: calculating the maximum absolute value among the 32 inputs, generating a shared scale, and producing 32 outputs in the selected MX-format type. The hardware converters were implemented on an FPGA, and experimental results are reported.
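
The three conversion steps named in the abstract can be illustrated with a small software model. This is a minimal sketch and not the paper's hardware algorithm. The `FORMATS` table, the function names, the power-of-two shared scale with a floor of $2^{-127}$, round-half-to-even rounding, and saturation at the largest finite value are all assumptions, taken from a reading of the public OCP MX specification. INT8 is left out to keep the sketch short.

```python
import math

# (exponent bits K, mantissa bits R, largest finite magnitude) per element type.
# Assumed constants based on the OCP MX specification, not on this paper.
FORMATS = {
    "E5M2": (5, 2, 57344.0),
    "E4M3": (4, 3, 448.0),
    "E3M2": (3, 2, 28.0),
    "E2M3": (2, 3, 7.5),
    "E2M1": (2, 1, 6.0),
}


def floor_log2(x):
    """floor(log2(x)) for x > 0, computed exactly via frexp."""
    return math.frexp(x)[1] - 1


def shared_exponent(values, fmt):
    """Steps 1 and 2: maximum absolute value, then the shared scale exponent."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return -127  # assumed smallest scale exponent for an all-zero block
    emax = floor_log2(FORMATS[fmt][2])  # largest element exponent
    return max(-127, floor_log2(max_abs) - emax)


def quantize(x, fmt):
    """Step 3 for one element: round to R mantissa bits, saturate on overflow."""
    k, r, max_normal = FORMATS[fmt]
    if x == 0.0:
        return 0.0
    emin = 1 - (2 ** (k - 1) - 1)           # smallest normal exponent
    e = max(floor_log2(abs(x)), emin)       # subnormals share emin
    step = 2.0 ** (e - r)                   # spacing of representable values
    q = round(x / step) * step              # Python rounds half to even
    return math.copysign(min(abs(q), max_normal), x)


def convert_block(values, fmt):
    """Return (shared scale exponent, list of quantized elements)."""
    x_exp = shared_exponent(values, fmt)
    scale = 2.0 ** x_exp
    return x_exp, [quantize(v / scale, fmt) for v in values]
```

For example, `convert_block([1.0, -0.5, 3.0, 0.25] + [0.0] * 28, "E2M1")` yields a shared exponent of `-1` and elements beginning `[2.0, -1.0, 6.0, 0.5]`; multiplying each element by $2^{-1}$ recovers the inputs exactly in this case.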

Paper Structure

This paper contains 8 sections, 2 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: The relation between FP32 and MX-format: $S_1,S_2,\dots,S_{32}$ are the sign bits, $EV_1,EV_2,\dots,EV_{32}$ the exponents, and $MV_1,MV_2,\dots,MV_{32}$ the mantissas of $V_1,V_2,\dots,V_{32}$; $X$ is the shared scale; $EK_1,EK_2,\dots,EK_{32}$ are the exponents and $MR_1,MR_2,\dots,MR_{32}$ the mantissas of $P_1,P_2,\dots,P_{32}$, where $K$ and $R$ are the numbers of exponent and mantissa bits, respectively, of the MX-format
  • Figure 2: Architecture of the converter
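
To make the FP32 side of the Figure 1 notation concrete, the following sketch splits one single-precision input $V_i$ into its sign $S_i$, 8-bit biased exponent $EV_i$, and 23-bit mantissa $MV_i$. The function name `fp32_fields` is hypothetical and only illustrates the standard IEEE 754 binary32 layout; it is not taken from the paper.

```python
import struct


def fp32_fields(v):
    """Split a value, rounded to FP32, into (sign, biased exponent, mantissa)."""
    # Pack as big-endian binary32, then reinterpret the 4 bytes as an integer.
    bits = struct.unpack(">I", struct.pack(">f", v))[0]
    sign = bits >> 31                 # S: 1 bit
    exponent = (bits >> 23) & 0xFF    # EV: 8 bits, bias 127
    mantissa = bits & 0x7FFFFF        # MV: 23 bits, implicit leading 1 omitted
    return sign, exponent, mantissa
```

For instance, `fp32_fields(-1.5)` returns `(1, 127, 0x400000)`: a negative sign, an unbiased exponent of 0, and a mantissa whose top bit encodes the fractional 0.5.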