Iterative Refinement with Low-Precision Posits

James Quinlan, E. Theodore L. Omtzigt

TL;DR

This work addresses solving $A x = b$ for nonsingular $A$ using low‑precision posit arithmetic within a mixed‑precision iterative refinement framework. It combines low‑precision LU factorization, high‑precision residual corrections, and a quire‑assisted deferred rounding strategy, augmented by two‑sided row/column equilibration to reduce conditioning. The authors demonstrate that 16‑bit posits with equilibration can achieve accuracy comparable to IEEE fp16 on a set of sparse matrices, with two‑sided scaling yielding robust convergence without specialized preconditioners. The findings suggest substantial potential for efficiency improvements in HPC/AI workloads while maintaining numerical reliability, and they lay groundwork for integrating posits with advanced preconditioning and scaling strategies in future work.

Abstract

This research investigates a mixed-precision iterative refinement method that uses posit numbers instead of the standard IEEE floating-point format. The method is applied to a general linear system $Ax = b$, where $A$ is a large sparse matrix. Scaling techniques such as row and column equilibration map the matrix entries to higher-density regions of machine numbers before the $O(n^3)$ factorization is performed. Low-precision LU factorization followed by forward/backward substitution provides an initial estimate, which is then refined with residuals computed in higher precision. The results demonstrate that a 16-bit posit configuration combined with equilibration produces accuracy comparable to IEEE half-precision (fp16), suggesting a workable balance between efficiency and accuracy.
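
The pipeline described above (equilibrate, factor in low precision, refine with high-precision residuals) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: NumPy has no posit type, so `np.float16` stands in for the 16-bit format; the quire's deferred rounding is not modeled; the max-norm scaling in `equilibrate` is only one simple form of row/column equilibration; the matrix is treated as dense; and the function names (`equilibrate`, `lu_lowprec`, `lu_solve`, `refine`) are hypothetical.

```python
import numpy as np

def equilibrate(A):
    """Two-sided max-norm scaling (one simple choice, assumed here).

    Returns vectors r, c such that diag(r) @ A @ diag(c) has all entries
    in [-1, 1] and every column has largest magnitude 1.
    Assumes A has no zero row or column.
    """
    r = 1.0 / np.abs(A).max(axis=1)                # row scaling
    c = 1.0 / np.abs(A * r[:, None]).max(axis=0)   # column scaling of the row-scaled matrix
    return r, c

def lu_lowprec(A, dtype=np.float16):
    """LU with partial pivoting; every arithmetic step is rounded to `dtype`.

    float16 is a stand-in for a 16-bit posit (assumption).
    Returns the packed factors and the row permutation.
    """
    LU = A.astype(dtype)
    n = LU.shape[0]
    piv = np.arange(n)
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(LU[k:, k])))
        if p != k:
            LU[[k, p]] = LU[[p, k]]
            piv[[k, p]] = piv[[p, k]]
        LU[k + 1:, k] /= LU[k, k]                                   # multipliers
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])  # Schur update
    return LU, piv

def lu_solve(LU, piv, b):
    """Forward/backward substitution in float64 using the low-precision factors."""
    LUw = LU.astype(np.float64)
    y = b[piv].astype(np.float64)
    n = len(y)
    for i in range(n):                       # solve L z = P b (unit lower triangular)
        y[i] -= LUw[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):           # solve U y = z
        y[i] = (y[i] - LUw[i, i + 1:] @ y[i + 1:]) / LUw[i, i]
    return y

def refine(A, b, tol=1e-12, max_iter=50, low=np.float16):
    """Iterative refinement on the equilibrated system; returns (x, iterations)."""
    r, c = equilibrate(A)
    As = A * r[:, None] * c[None, :]         # scaled matrix, kept in float64
    bs = b * r
    LU, piv = lu_lowprec(As, low)            # the O(n^3) step, in low precision
    y = lu_solve(LU, piv, bs)                # initial estimate
    iters = 0
    while iters < max_iter:
        res = bs - As @ y                    # residual in float64
        if np.linalg.norm(res, np.inf) <= tol * np.linalg.norm(bs, np.inf):
            break
        y += lu_solve(LU, piv, res)          # correction from the cheap factors
        iters += 1
    return c * y, iters                      # undo the column scaling
```

Calling `x, iters = refine(A, b)` on a square, nonsingular `float64` array returns the solution of the original (unscaled) system together with the number of correction steps taken. Only the factorization runs in the low-precision type here; doing the substitutions in `float64` is a simplification chosen to keep the sketch short.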

Paper Structure

This paper contains 12 sections, 10 equations, 1 figure, 6 tables, 5 algorithms.

Figures (1)

  • Figure 1: A posit$\langle$16,2$\rangle$ with 7-bit fraction and 2-bit exponent.
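
To make the field layout named in the Figure 1 caption concrete, the following sketch decodes a posit bit pattern into a Python float using the standard sign / regime / exponent / fraction interpretation. The function name `decode_posit` is hypothetical, and the bit pattern `0x7CC0` used in the note below is an illustrative choice that happens to have a 7-bit fraction field; it is not necessarily the pattern drawn in the figure.

```python
def decode_posit(bits: int, nbits: int = 16, es: int = 2) -> float:
    """Decode an nbits-wide posit with es exponent bits into a float.

    NaR (the single "not a real" pattern) is returned as float NaN.
    """
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nbits - 1):
        return float("nan")                  # NaR
    sign = bits >> (nbits - 1)
    if sign:
        bits = (-bits) & mask                # negative posits are two's complement
    # The nbits-1 bits after the sign, as a string of '0'/'1'.
    body = format(bits & (mask >> 1), f"0{nbits - 1}b")
    first = body[0]
    run = len(body) - len(body.lstrip(first))    # length of the regime run
    k = run - 1 if first == "1" else -run        # regime value
    rest = body[run + 1:]                        # skip the regime terminator bit
    exp_bits = rest[:es].ljust(es, "0")          # missing exponent bits read as 0
    frac_bits = rest[es:]
    e = int(exp_bits, 2) if es else 0
    f = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
    value = 2.0 ** (k * (1 << es) + e) * (1.0 + f)
    return -value if sign else value
```

For example, `0x7CC0` splits into sign `0`, regime `111110` (so $k = 4$), exponent `01`, and the 7-bit fraction `1000000`, giving $2^{4 \cdot 4 + 1} \times 1.5$; `decode_posit(0x7CC0)` returns `196608.0`. Because the regime length varies, the fraction width varies with it: `0x4000` (regime `10`) leaves 11 fraction bits and decodes to `1.0`.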