Mixed precision HODLR matrices

Erin Carson; Xinye Chen; Xiaobo Liu

TL;DR

This work develops an adaptive-precision framework for constructing and storing Hierarchical Off-Diagonal Low-Rank (HODLR) matrices in mixed precision. It proves that off-diagonal low-rank blocks can be compressed with controlled error and provides a sharp bound linking the global representation error to the per-level precisions and the hierarchy depth. The authors establish backward-error stability for matrix–vector products and LU factorization under the adaptive scheme, offering guidance on choosing working precision relative to the approximation error. Numerical experiments on kernel- and Schur-complement–like matrices validate the theory and demonstrate meaningful storage savings, with publicly available code to reproduce the results.

Abstract

Hierarchical matrix computations have attracted significant attention in the science and engineering community, as exploiting data-sparse structures can significantly reduce the computational complexity of many important kernels. One particularly popular option within this class is the Hierarchical Off-Diagonal Low-Rank (HODLR) format. In this paper, we show that the off-diagonal blocks of HODLR matrices that are approximated by low-rank matrices can be represented in low precision without degrading the quality of the overall approximation (with the error growth bounded by a factor of $2$). We also present an adaptive-precision scheme for constructing and storing HODLR matrices, and we prove that the use of mixed precision does not compromise the numerical stability of the resulting HODLR matrix--vector product and LU factorization. That is, the resulting error in these computations is not significantly greater than in the case where one precision (say, double) is used for constructing and storing the HODLR matrix. Our analyses further give insight into how to choose the working precision in HODLR matrix computations relative to the approximation error so that the effects of finite precision are not observed. Intuitively, when a HODLR matrix is subject to a high degree of approximation error, subsequent computations can be performed in a lower precision without detriment. We demonstrate the validity of our theoretical results through a range of numerical experiments.
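The central claim of the abstract, that low-rank off-diagonal blocks tolerate low-precision storage once the truncation error dominates, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the kernel $1/(1+|x_i-x_j|)$, the helper names `truncated_svd` and `rel_error`, and the choice of fp32 as the low precision are assumptions made purely for illustration.

```python
import numpy as np

def truncated_svd(block, eps):
    """Factor block ~= U @ V, dropping singular values at or below eps * sigma_max."""
    W, s, Vt = np.linalg.svd(block, full_matrices=False)
    rank = max(1, int(np.sum(s > eps * s[0])))
    # Fold the singular values into the left factor.
    return W[:, :rank] * s[:rank], Vt[:rank, :]

def rel_error(block, U, V):
    """Relative 2-norm error of the product U @ V, evaluated in double precision."""
    approx = U.astype(np.float64) @ V.astype(np.float64)
    return np.linalg.norm(block - approx, 2) / np.linalg.norm(block, 2)

# Hypothetical test problem: a smooth kernel matrix on sorted random points.
rng = np.random.default_rng(0)
n = 256
x = np.sort(rng.random(n))
K = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))
B = K[: n // 2, n // 2 :]  # top-right off-diagonal block at level 1

eps = 1e-4
U, V = truncated_svd(B, eps)                      # factors computed in fp64
err_fp64 = rel_error(B, U, V)

# Store the factors in fp32 (unit roundoff about 6e-8, well below eps).
U32, V32 = U.astype(np.float32), V.astype(np.float32)
err_fp32 = rel_error(B, U32, V32)

print(f"rank = {U.shape[1]}, fp64 error = {err_fp64:.2e}, fp32 error = {err_fp32:.2e}")
```

In this sketch the fp32 factors take half the memory of the fp64 ones, while the rounding error they introduce is far smaller than the truncation error already accepted through $\varepsilon$.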

Paper Structure (13 sections, 9 theorems, 68 equations, 5 figures, 2 tables, 4 algorithms)

Introduction
HODLR Matrices
Mixed-precision construction and representation of HODLR matrices
Mixed precision HODLR matrix representation
An adaptive-precision algorithm
Matrix--vector products
LU factorization
Numerical experiments
Global construction error
Matrix--vector products
LU factorization
Theoretical storage
Conclusions

Key Result

Lemma 2.4

Let $\widetilde{H}$ be a $(\mathcal{T}_{\ell}, p, \varepsilon)$-HODLR matrix associated with $H$. Then for the HODLR matrices $\widetilde{H}^{(k)}_{ii}$, $i=1\colon 2^k$, at level $k \in \{0, \ldots, \ell\}$, it holds that

Figures (5)

Figure 1: Different matrices often show different norm distributions across levels. We compute the maximum off-diagonal block norm at each level of a HODLR matrix with depth $\ell=6$; the color indicates the size of the norm. Based on these norms, with $\varepsilon=10^{-4}$ and the set of precisions $\{\text{q52, bf16, fp16, fp32, fp64}\}$, the algorithm chooses the per-level precisions $\{\text{bf16, fp16, fp16, fp32, fp32, fp32}\}$ for the matrix saylr3 and $\{\text{q52, fp32, fp32, fp32, fp32, fp32}\}$ for the matrix LeGresley_2508.
Figure 1: Reconstruction error for the adaptive-precision HODLR matrix. The $x$-axis indicates the value of $\varepsilon$ and the $y$-axis indicates relative global construction error.
Figure 2: Backward error of matrix--vector products for mixed precision HODLR matrix with $\ell=8$. The $x$-axis indicates the value of $\varepsilon$ and the $y$-axis indicates the relative backward error.
Figure 3: Backward error of LU factorization for mixed precision HODLR matrix. The $x$-axis indicates the value of $\varepsilon$ and the $y$-axis indicates relative backward error.
Figure 4: Storage savings of adaptive-precision HODLR matrices relative to uniform (double) precision HODLR matrices. The depth is $\ell=8$; purple bars correspond to $\varepsilon=10^{-7}$, green bars indicate $\varepsilon=10^{-4}$, and blue bars indicate $\varepsilon=10^{-1}$.
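
The per-level precision choice described in the first Figure 1 caption can be sketched as follows. The selection rule used here, $u_k \le \varepsilon / (2^{k/2}\,\xi_k)$ with $\xi_k$ the largest off-diagonal block norm at level $k$ divided by $\|H\|_F$, is an assumption for illustration and may differ from the criterion in the paper; the function names are hypothetical. Only the unit roundoffs of the five formats are standard values. Since NumPy has no native bf16 or q52 types, the sketch returns format names and does not cast any data.

```python
import numpy as np

# Candidate formats from the caption with their unit roundoffs, coarsest first.
UNIT_ROUNDOFF = [
    ("q52", 2.0**-3),
    ("bf16", 2.0**-8),
    ("fp16", 2.0**-11),
    ("fp32", 2.0**-24),
    ("fp64", 2.0**-53),
]

def max_offdiag_norms(H, depth):
    """Largest Frobenius norm among sibling off-diagonal blocks at levels 1..depth."""
    n = H.shape[0]
    norms = []
    for k in range(1, depth + 1):
        edges = np.linspace(0, n, 2**k + 1).astype(int)  # 2^k diagonal blocks
        level_max = 0.0
        for i in range(0, 2**k, 2):                      # each sibling pair
            r0, r1, r2 = edges[i], edges[i + 1], edges[i + 2]
            upper = H[r0:r1, r1:r2]
            lower = H[r1:r2, r0:r1]
            level_max = max(level_max, np.linalg.norm(upper), np.linalg.norm(lower))
        norms.append(level_max)
    return norms

def choose_precisions(H, depth, eps):
    """Pick, per level, the coarsest format whose unit roundoff meets the assumed bound."""
    total = np.linalg.norm(H)
    choice = []
    for k, level_norm in enumerate(max_offdiag_norms(H, depth), start=1):
        xi = level_norm / total
        # Assumed rule: u_k <= eps / (2^(k/2) * xi_k); zero blocks accept any format.
        bound = np.inf if xi == 0 else eps / (2 ** (k / 2) * xi)
        name = next((fmt for fmt, u in UNIT_ROUNDOFF if u <= bound), "fp64")
        choice.append(name)
    return choice

# Hypothetical example: a smooth kernel matrix on a uniform grid.
x = np.linspace(0.0, 1.0, 256)
H = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))
print(choose_precisions(H, depth=4, eps=1e-4))
```

Under this assumed rule, levels whose off-diagonal blocks carry little of the total norm are assigned coarser formats, which matches the qualitative behavior the caption reports for saylr3 and LeGresley_2508.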

Theorems & Definitions (19)

Definition 2.1 [mrk20]
Definition 2.2: $(\mathcal{T}_{\ell}, p)$-HODLR matrix [mrk20]
Definition 2.3: $(\mathcal{T}_{\ell}, p, \varepsilon)$-HODLR matrix
Lemma 2.4
Proof 1
Theorem 3.1: Global error in mixed-precision HODLR representation
Proof 2
Corollary 3.2: Local error of mixed-precision HODLR representation
Lemma 3.3
Proof 3
...and 9 more
