Table of Contents
Fetching ...

Fast Direct Solvers

Per-Gunnar Martinsson, Michael O'Neil

TL;DR

The paper surveys fast direct solvers that construct approximate inverses or factorizations for matrices from elliptic PDEs or integral equations by exploiting data-sparsity, particularly rank-deficient off-diagonal blocks. It unifies approaches based on HODLR/HSS/HBS formats, nested dissection, and boundary-integral formulations, showing how potential operators exhibit inherently low-rank interactions that enable near-linear complexity in many settings. Key contributions include detailed treatments of HODLR inverses, recursive skeletonization, strong vs. weak admissibility, and randomized/black-box compression techniques, plus guidance on selecting formats for PDEs, BIEs, or kernel matrices. The survey highlights practical considerations, such as stability, HPC implementations, and software availability, while outlining open problems in high-frequency regimes and rigorous analysis. Overall, fast direct solvers offer a principled, scalable alternative to iterative methods for challenging problems and support operator algebra, preconditioning, and multiphysics couplings with significant performance advantages on modern hardware.

Abstract

This survey describes a class of methods known as "fast direct solvers". These algorithms address the problem of solving a system of linear equations $\boldsymbol{Ax}=\boldsymbol{b}$ arising from the discretization of either an elliptic PDE or of an associated integral equation. The matrix $\boldsymbol{A}$ will be sparse when the PDE is discretized directly, and dense when an integral equation formulation is used. In either case, industry practice for large scale problems has for decades been to use iterative solvers such as multigrid, GMRES, or conjugate gradients. A direct solver, in contrast, builds an approximation to the inverse of $\boldsymbol{A}$, or alternatively, an easily invertible factorization (e.g. LU or Cholesky). A major development in numerical analysis in the last couple of decades has been the emergence of algorithms for constructing such factorizations or performing such inversions in linear or close to linear time. Such methods must necessarily exploit that the matrix $\boldsymbol{A}^{-1}$ is "data-sparse", typically in the sense that it can be tessellated into blocks that have low numerical rank. This survey provides a unifying context to both sparse and dense fast direct solvers, introduces key concepts with a minimum of notational overhead, and provides guidance to help a user determine the best method to use for a given application.

Fast Direct Solvers

TL;DR

The paper surveys fast direct solvers that construct approximate inverses or factorizations for matrices from elliptic PDEs or integral equations by exploiting data-sparsity, particularly rank-deficient off-diagonal blocks. It unifies approaches based on HODLR/HSS/HBS formats, nested dissection, and boundary-integral formulations, showing how potential operators exhibit inherently low-rank interactions that enable near-linear complexity in many settings. Key contributions include detailed treatments of HODLR inverses, recursive skeletonization, strong vs. weak admissibility, and randomized/black-box compression techniques, plus guidance on selecting formats for PDEs, BIEs, or kernel matrices. The survey highlights practical considerations, such as stability, HPC implementations, and software availability, while outlining open problems in high-frequency regimes and rigorous analysis. Overall, fast direct solvers offer a principled, scalable alternative to iterative methods for challenging problems and support operator algebra, preconditioning, and multiphysics couplings with significant performance advantages on modern hardware.

Abstract

This survey describes a class of methods known as "fast direct solvers". These algorithms address the problem of solving a system of linear equations arising from the discretization of either an elliptic PDE or of an associated integral equation. The matrix will be sparse when the PDE is discretized directly, and dense when an integral equation formulation is used. In either case, industry practice for large scale problems has for decades been to use iterative solvers such as multigrid, GMRES, or conjugate gradients. A direct solver, in contrast, builds an approximation to the inverse of , or alternatively, an easily invertible factorization (e.g. LU or Cholesky). A major development in numerical analysis in the last couple of decades has been the emergence of algorithms for constructing such factorizations or performing such inversions in linear or close to linear time. Such methods must necessarily exploit that the matrix is "data-sparse", typically in the sense that it can be tessellated into blocks that have low numerical rank. This survey provides a unifying context to both sparse and dense fast direct solvers, introduces key concepts with a minimum of notational overhead, and provides guidance to help a user determine the best method to use for a given application.

Paper Structure

This paper contains 65 sections, 1 theorem, 110 equations, 23 figures.

Key Result

Lemma 7.1

Suppose that $\bm{\mathsf{A}}$ is an invertible $N\times N$ matrix, that $K$ is a positive integer such that $K < N$, and that $\bm{\mathsf{A}}$ admits the factorization Then where provided all inverses that appear in the formulas exist. Moreover, $\hbox{rank}(\bm{\mathsf{G}}) = N-K$.

Figures (23)

  • Figure 1: A representative rank-structured matrix, such as an "$\mathcal{H}$-matrix" or "HODLR matrix".
  • Figure 1: (a) Sparsity pattern of the tridiagonal coefficient matrix $\bm{\mathsf{A}}$ introduced in Section \ref{['sec:onedimbvp']} for $N=6$. (b) The inverse $\bm{\mathsf{A}}^{-1}$ is semi-separable, which is to say that its upper triangular part (the blue elements) and its lower triangular part (the red elements) are each restrictions of rank-1 matrices. The two rank-one matrices must agree on the diagonal (purple), so a semi-separable matrix is defined by $3N-2$ real numbers, just like a tridiagonal one. (c,d) The sparsity pattern of the factors $\bm{\mathsf{L}}$ and $\bm{\mathsf{U}}$ in the factorization $\bm{\mathsf{A}} = \bm{\mathsf{L}}\bm{\mathsf{U}}$. Since $\bm{\mathsf{L}}(i,i)=1$, there are again $3N-2$ independent numbers.
  • Figure 1: A numerical example illustrating the behavior of the finite difference (FD) method of Section \ref{['sec:onedimbvp']} and the integral equation (IE) method of Section \ref{['sec:onedimbvpIE']}. We first considered a problem on the domain $[0,1]$ with a non-oscillatory solution "(non-osc)" where $m(x) = 100(1+x)\cos(x)$ and $g(x) = 1 + \cos(1+x)$. We then swapped the sign of $m$ (but kept everything else the same) to get a problem with an oscillatory solution "(osc)". A key point here is that the condition number of the integral equation formulation does not grow with $N$. A secondary point is that elliptic problems with oscillatory solutions are far more challenging.
  • Figure 1: Illustration of how a binary tree $\{I_{\tau}\}_{\tau=1}^{15}$ of subindices to the full index vector $I = [1,2,\dots,N]$ induces a hierarchical tessellation of a matrix, shown for the case of a tree of depth $L=3$. The root node is $I_{1} = I$.
  • Figure 1: Illustration of the importance of ordering in reducing fill-in when factorizing sparse matrices. (a) Let $\bm{\mathsf{A}}$ be a positive definite matrix with the sparsity pattern show in the figure. (b) The Cholesky factor $\bm{\mathsf{C}}$ such that $\bm{\mathsf{A}} = \bm{\mathsf{C}}\,\bm{\mathsf{C}}^{*}$. (c) Form a new matrix $\bm{\mathsf{B}}$ by simply permuting the rows and columns of $\bm{\mathsf{A}}$. (d) The Cholesky factor $\bm{\mathsf{R}}$ such that $\bm{\mathsf{B}} = \bm{\mathsf{R}}\,\bm{\mathsf{R}}^{*}$.
  • ...and 18 more figures

Theorems & Definitions (13)

  • Remark 1.1: High frequency problems
  • Remark 3.1: Inhomogeneous boundary data
  • Remark 3.2: A curious connection
  • Remark 4.1: Recompression
  • Remark 5.1: Pivoting and numerical stability
  • Remark 6.1
  • Remark 6.2: Linear algebraic vs. analytic dimension reduction
  • Lemma 7.1: Variation of Woodbury
  • Remark 7.2: Recursive skeletonization
  • Remark 8.1
  • ...and 3 more