Exploring and Exploiting Runtime Reconfigurable Floating Point Precision in Scientific Computing: a Case Study for Solving PDEs

Cong "Callie" Hao

TL;DR

This work tackles the high cost of 64-bit precision in scientific computing by proposing runtime reconfigurable precision (rr-precision) and a hardware-friendly R2F2 multiplier that adaptively allocates bits between exponent and mantissa at runtime. It combines a data-range exploration, which shows that values cluster locally and that the clusters shift over time, with a novel flexible floating-point format <EB,MB,FX> and a precision-adjustment unit that trades exponent bits against mantissa bits to maintain fidelity. Empirical evaluation on FPGA across two PDE-based benchmarks demonstrates ~70% average error reduction versus half precision, with R2F2 matching 32-bit fidelity in scenarios where half precision fails, at a modest hardware overhead and with no latency increase. The approach enables significant energy and memory savings for scientific simulations and is complemented by open-source code for broader adoption.

Abstract

Scientific computing applications, such as computational fluid dynamics and climate modeling, typically rely on 64-bit double-precision floating-point operations, which are extremely costly in terms of computation, memory, and energy. While the machine learning community has successfully utilized low-precision computations, scientific computing remains cautious due to concerns about numerical stability. To tackle this long-standing challenge, we propose a novel approach to dynamically adjust the floating-point data precision at runtime, maintaining computational fidelity using lower bit widths. We first conduct a thorough analysis of data range distributions during scientific simulations to identify opportunities and challenges for dynamic precision adjustment. We then propose a runtime reconfigurable, flexible floating-point multiplier (R2F2), which automatically and dynamically adjusts multiplication precision based on the current operands, ensuring accurate results with lower bit widths. Our evaluation shows that 16-bit R2F2 significantly reduces error rates by 70.2% compared to standard half-precision, with resource overhead ranging from a 5% reduction to a 7% increase and no latency overhead. In two representative scientific computing applications, R2F2, using 16 or fewer bits, can achieve the same simulation results as 32-bit precision, while standard half precision will fail. This study pioneers runtime reconfigurable arithmetic, demonstrating great potential to enhance scientific computing efficiency. Code available at https://github.com/sharc-lab/R2F2.
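
The trade-off between exponent and mantissa bits can be illustrated with a small software sketch. This is not the R2F2 hardware design or the code from the repository above; the helper names `quantize` and `mean_rel_error`, the IEEE-style bias, the saturating overflow, and the sample data are all assumptions made for illustration. The sketch rounds values to a custom format with `eb` exponent bits and `mb` mantissa bits and compares three ways of splitting the same 15-bit budget (plus one sign bit).

```python
import math


def quantize(x: float, eb: int, mb: int) -> float:
    """Round x to the nearest value with eb exponent bits and mb mantissa bits.

    Illustrative assumptions: IEEE-style bias, subnormals supported,
    overflow saturates to the largest finite value (no inf encoding).
    """
    if x == 0.0 or math.isnan(x):
        return x
    bias = (1 << (eb - 1)) - 1
    e_min = 1 - bias          # smallest normal exponent
    e_max = bias              # largest exponent (all-ones code reserved)
    sign = -1.0 if x < 0 else 1.0
    a = abs(x)
    _, e = math.frexp(a)      # a = m * 2**e with 0.5 <= m < 1
    e -= 1                    # now a = f * 2**e with 1 <= f < 2
    e = max(e, e_min)         # below e_min the spacing stays fixed (subnormal)
    step = 2.0 ** (e - mb)    # spacing of representable values near a
    q = round(a / step) * step  # round() ties to even, like IEEE default
    max_val = (2.0 - 2.0 ** (-mb)) * 2.0 ** e_max
    return sign * min(q, max_val)


def mean_rel_error(values, eb: int, mb: int) -> float:
    """Average relative rounding error of values in the (eb, mb) format."""
    return sum(abs(quantize(v, eb, mb) - v) / abs(v) for v in values) / len(values)


# Two hypothetical data clusters: one near 1.0, one far below 1.0.
near_one = [1.0 + 0.037 * k for k in range(1, 20)]
tiny = [1.3e-7 * k for k in range(1, 20)]

for eb, mb in [(3, 12), (5, 10), (8, 7)]:
    print(f"(eb={eb}, mb={mb})  near 1.0: {mean_rel_error(near_one, eb, mb):.2e}"
          f"  tiny: {mean_rel_error(tiny, eb, mb):.2e}")
```

Under these assumptions, the mantissa-heavy split gives the smallest error for the cluster near 1.0, while the tiny values fall outside the normal range of the 5-bit exponent and are better served by the exponent-heavy split. No single static split suits both clusters, which is the situation that adjusting the split at runtime is meant to address.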

Paper Structure (13 sections, 1 equation, 8 figures, 1 table)

This paper contains 13 sections, 1 equation, 8 figures, 1 table.

Figures (8)

  • Figure 1: 1D heat equation simulation using different data precision and heat initialization. (a)-(b): sin; (c)-(d): exp initialization.
  • Figure 2: Data distribution in the heat equation simulation. (a) implies that the data is globally wide while some clusters remain locally narrow; (b) and (c) demonstrate dynamic data range shift.
  • Figure 3: Average computation error using different configurations for floating point precision. X-axis shows the data precision in the format of (exponent, fraction); Y-axis is the average error percentage compared with the 32-bit floating point counterpart.
  • Figure 4: Details of the proposed flexible floating point multiplier
  • Figure 5: Logic to automatically adjust the precision.
  • ...and 3 more figures