Avoiding breakdown in incomplete factorizations in low precision arithmetic

Jennifer Scott; Miroslav Tůma

TL;DR

This work investigates constructing and employing half-precision (fp16) incomplete factorization preconditioners, particularly incomplete Cholesky factorizations, to solve large sparse SPD systems to double-precision accuracy via mixed-precision iterative refinement. It identifies breakdown mechanisms ($B1$–$B3$) that can occur during incomplete factorization in low precision and introduces prescaling and global diagonal modifications to prevent breakdown, along with a shift strategy to restart the factorization when needed. The authors develop level-based ($IC(\ell)$) half-precision factorization software and integrate the resulting factors within LU-IR and Krylov-IR frameworks (CG-IR and GMRES-IR), validating the approach on SuiteSparse SPD problems. Results show that fp16 incomplete preconditioners can achieve performance close to fp64 in many cases, at the cost of additional Krylov iterations. The authors argue for broader exploration of fp16 direct solvers and other preconditioners as hardware support improves. Overall, the study demonstrates the practical viability of robust, memory-efficient half-precision preconditioners for large-scale sparse linear systems, with potential for significant memory and energy savings in real-world applications.

Abstract

The emergence of low precision floating-point arithmetic in computer hardware has led to a resurgence of interest in the use of mixed precision numerical linear algebra. For linear systems of equations, there has been renewed enthusiasm for mixed precision variants of iterative refinement. We consider the iterative solution of large sparse systems using incomplete factorization preconditioners. The focus is on the robust computation of such preconditioners in half precision arithmetic and employing them to solve symmetric positive definite systems to higher precision accuracy; however, the proposed ideas can be applied more generally. Even for well-conditioned problems, incomplete factorizations can break down when small entries occur on the diagonal during the factorization. When using half precision arithmetic, overflows are an additional possible source of breakdown. We examine how breakdowns can be avoided and we implement our strategies within new half precision Fortran sparse incomplete Cholesky factorization software. Results are reported for a range of problems from practical applications. These demonstrate that, even for highly ill-conditioned problems, half precision preconditioners can potentially replace double precision preconditioners, although unsurprisingly this may be at the cost of additional iterations of a Krylov solver.
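The breakdown handling described in the abstract can be illustrated with a small sketch. It is not the authors' Fortran software: it uses dense storage for readability, and the function name `ic0_fp16`, the parameters `shift0`, `tau`, and `max_restarts`, the unit-diagonal scaling, and the shift-doubling rule are all assumptions made for illustration. The sketch scales the matrix, attempts an $IC(0)$ factorization in fp16, treats a small or non-finite pivot (the latter typically caused by overflow) as breakdown, and restarts with a larger global diagonal shift.

```python
import numpy as np

def ic0_fp16(A, shift0=1e-3, tau=1e-5, max_restarts=20):
    """Illustrative IC(0) in fp16 with scaling and shift-and-restart.

    Returns (L, d, alpha): L is a float16 lower-triangular factor with
    L @ L.T approximating S A S + alpha*I on the pattern of A, where
    S = diag(d). All parameter names and defaults are hypothetical.
    """
    A = np.asarray(A, dtype=np.float64)
    n = A.shape[0]
    d = 1.0 / np.sqrt(np.diag(A))          # assumed scaling: unit diagonal
    As = (A * d[:, None]) * d[None, :]     # scaled matrix, computed in fp64
    pattern = np.tril(As != 0.0)           # IC(0): keep only the pattern of A
    alpha = 0.0
    for _ in range(max_restarts + 1):
        # Apply the global shift in fp64, then round the lower triangle to fp16.
        L = np.tril(As + alpha * np.eye(n)).astype(np.float16)
        ok = True
        with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
            for k in range(n):
                pivot = L[k, k]
                # Breakdown: pivot too small, negative, or not finite.
                if not np.isfinite(pivot) or pivot < tau:
                    ok = False
                    break
                L[k, k] = np.sqrt(pivot)
                L[k + 1:, k] = L[k + 1:, k] / L[k, k]
                # Breakdown: dividing by a small pivot overflowed in fp16.
                if not np.all(np.isfinite(L[k + 1:, k])):
                    ok = False
                    break
                for j in range(k + 1, n):
                    if L[j, k] != 0:
                        col = L[j:, j]                    # view into L
                        update = L[j:, k] * L[j, k]       # fp16 products
                        mask = pattern[j:, j]             # drop fill-in
                        col[mask] -= update[mask]
        if ok:
            return L, d, alpha
        # Assumed restart rule: start from shift0, then double.
        alpha = shift0 if alpha == 0.0 else 2.0 * alpha
    raise RuntimeError("factorization failed for all shifts tried")

# Usage: a small SPD tridiagonal matrix needs no shift.
A = 2.0 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)
L, d, alpha = ic0_fp16(A)
```

The factor returned for a shifted matrix is only an approximation to a factorization of the scaled matrix, which is why it is used as a preconditioner inside a higher-precision iteration rather than as a direct solver.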

Paper Structure (16 sections, 2 equations, 1 figure, 9 tables, 8 algorithms)

Introduction
Incomplete factorizations
A brief introduction to incomplete factorizations
Breakdown during incomplete factorizations
Prescaling of the system matrix
Global modifications to prevent breakdown
Using the low precision factors
LU and Cholesky factorization based iterative refinement
LU-IR and Krylov-IR using low precision factors
Generalisation to low precision incomplete factors
Numerical experiments
Results for IC-LU-IR
Dependence of the iteration counts on the CG tolerance
Results for $IC(0)$
Results for $IC(\ell)$
...and 1 more sections

Figures (1)

Figure 4.1: IC-CG-IR total iteration counts for the fp16+$IC(\ell)$ preconditioner (dashed line) and fp64+$IC(\ell)$ preconditioner (solid line) with $\ell$ ranging from 1 to 9 for problems UTEP/Dubcova2 (left) and Cylshell/s2rmt3m1 (right).
