Polynomial Preconditioning for the Action of the Matrix Square Root and Inverse Square Root

Andreas Frommer, Gustavo Ramirez-Hidalgo, Marcel Schweitzer, Manuel Tsolakis

TL;DR

This work develops polynomial preconditioning to accelerate the action of matrix square roots and inverse square roots within Krylov subspace methods, enabling smaller effective subspaces and reduced storage. By constructing $p(A)=(q(A))^2$ where $q$ approximates $z^{-1/2}$, the preconditioned operator $A(q(A))^2$ becomes better conditioned and accelerates convergence in both Hermitian and non-Hermitian settings. The authors propose multiple polynomial construction strategies—including Chebyshev expansions, Ritz-value interpolation, and contour-based error minimization—and provide spectral conditioning analyses with extensive numerical demonstrations on Poisson, lattice QCD, and graph Laplacian problems. The results indicate substantial speedups and storage savings over restarting or sketching approaches, highlighting the practical impact for large-scale scientific computing and data-analysis tasks involving matrix square roots and inverse square roots. The framework also clarifies branch considerations for the square root and outlines pathways to extend polynomial preconditioning to broader matrix-function computations.
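
The construction $p(A)=(q(A))^2$ can be illustrated on a small dense example. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a Hermitian positive definite $A$ (a shifted 1D Laplacian chosen purely for illustration), builds $q$ as a Chebyshev interpolant of $z^{-1/2}$ on the spectral interval, and checks numerically that $A^{-1/2}b = q(A)\,(A\,q(A)^2)^{-1/2}b$ when $q$ is positive on the spectrum. The helper names, the test matrix, and the degree are all hypothetical choices. In a real setting $(A\,q(A)^2)^{-1/2}b$ would be approximated by a Krylov method using only matrix-vector products; dense eigendecompositions are used here only to make the check self-contained.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def inv_sqrt_cheb_coeffs(lmin, lmax, degree):
    """Chebyshev coefficients (in t on [-1, 1]) interpolating z**-0.5 on [lmin, lmax]."""
    def f(t):
        # Affine map from t in [-1, 1] to z in [lmin, lmax].
        z = 0.5 * (lmax - lmin) * t + 0.5 * (lmax + lmin)
        return z ** -0.5
    return C.chebinterpolate(f, degree)


def cheb_matrix_poly(coeffs, A, lmin, lmax):
    """Evaluate q(A) = sum_k coeffs[k] * T_k(S), with S the shifted and scaled A."""
    n = A.shape[0]
    I = np.eye(n)
    S = (2.0 * A - (lmax + lmin) * I) / (lmax - lmin)  # spectrum mapped into [-1, 1]
    T_prev, T_curr = I, S                              # T_0(S), T_1(S)
    Q = coeffs[0] * I
    if len(coeffs) > 1:
        Q = Q + coeffs[1] * S
    for c in coeffs[2:]:
        # Three-term recurrence T_{k+1} = 2 S T_k - T_{k-1}.
        T_prev, T_curr = T_curr, 2.0 * S @ T_curr - T_prev
        Q = Q + c * T_curr
    return Q


def inv_sqrt_dense(M):
    """Dense reference M^{-1/2} for a symmetric positive definite M."""
    w, V = np.linalg.eigh(M)
    return (V * w ** -0.5) @ V.T


# Illustrative test matrix (an assumption): shifted 1D Laplacian, symmetric positive definite.
n = 50
A = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
eigs_A = np.linalg.eigvalsh(A)
lmin, lmax = eigs_A[0], eigs_A[-1]

# q approximates z^{-1/2}; the preconditioned operator is B = A q(A)^2.
coeffs = inv_sqrt_cheb_coeffs(lmin, lmax, degree=8)
Q = cheb_matrix_poly(coeffs, A, lmin, lmax)
B = A @ Q @ Q
B = 0.5 * (B + B.T)  # remove rounding-level asymmetry

eigs_B = np.linalg.eigvalsh(B)
cond_A = lmax / lmin
cond_B = eigs_B[-1] / eigs_B[0]

rng = np.random.default_rng(0)
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

x_ref = inv_sqrt_dense(A) @ b        # A^{-1/2} b computed directly
x_pre = Q @ (inv_sqrt_dense(B) @ b)  # q(A) (A q(A)^2)^{-1/2} b

print("cond(A)          =", cond_A)
print("cond(A q(A)^2)   =", cond_B)
print("identity residual =", np.linalg.norm(x_ref - x_pre))
```

The spectrum of the preconditioned operator consists of the values $\lambda\,q(\lambda)^2$, which cluster near 1 when $q$ approximates $z^{-1/2}$ well. For badly conditioned matrices a low-degree $q$ may be much less accurate near the small eigenvalues, so the degree used here should not be read as a general recommendation.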

Abstract

While preconditioning is a long-standing concept to accelerate iterative methods for linear systems, generalizations to matrix functions are still in their infancy. We go a further step in this direction, introducing polynomial preconditioning for Krylov subspace methods which approximate the action of the matrix square root and inverse square root on a vector. Preconditioning reduces the subspace size and therefore avoids the storage problem together with -- for non-Hermitian matrices -- the increased computational cost per iteration that arises in the unpreconditioned case. Polynomial preconditioning is an attractive alternative to current restarting or sketching approaches since it is simpler and computationally more efficient. We demonstrate this for several numerical examples.

Paper Structure

This paper contains 16 sections, 4 theorems, 35 equations, 6 figures, 4 tables, 2 algorithms.

Key Result

Proposition 2.1

Let $A \in \mathbb{C}^{n \times n}$, let $p$ be a polynomial and consider the function $z^\alpha$ for some $\alpha \in \mathbb{R}$. If $\alpha < 0$, further assume that the matrices $A$ and $p(A)$ do not have eigenvalues in $(-\infty,0]$. Then

Figures (6)

  • Figure 3.1: Illustration of effects of polynomial preconditioning for the discretized two-dimensional Laplace operator: Absolute/relative polynomial approximation error on the spectral interval (top left/right), effect on spectrum (bottom left), convergence history and predicted slope (bottom right); see the text for details.
  • Figure 6.1: Relative error when approximating $A^{-1/2}b$ with Chebyshev preconditioning polynomials of various degrees $d-1$ ($d = 1$ corresponds to an unpreconditioned method), where $A$ is the discretization of the three-dimensional Laplace operator and $b$ is a random vector of unit norm.
  • Figure 6.2: Results for approximating $((Q_\mu)^{2})^{-1/2}b$ with Arnoldi preconditioning polynomials of various degrees, $b$ a random vector. Left: relative error as a function of the Arnoldi basis size up to a value of $1\,600$. Right: time (in s) to compute $H_{m}^{-1/2}$ as a function of $m$ with SLEPc.
  • Figure 6.3: Results for approximating $((Q_\mu)^{2})^{-1/2}b$ with Arnoldi preconditioning polynomials of various degrees, $b$ a random vector. Solid lines: relative error $\|f_m-f^*\|/\|f^*\|$, dashed lines: error measure $\|f_m-f_{m+k}\|/\|f_{m+k}\|$ with $k = 64/d$.
  • Figure 6.4: Results for approximating $L^{1/2}b$ with Arnoldi preconditioning polynomials of various degrees ($d = 1$ corresponds to an unpreconditioned method), where $L$ is the graph Laplacian of the network Kamvar/Stanford and $b$ is a random vector of unit norm.
  • ...and 1 more figure

Theorems & Definitions (10)

  • Proposition 2.1
  • proof
  • Proposition 3.1
  • proof
  • Example 3.2
  • Remark 4.1
  • Theorem 4.2
  • proof
  • Theorem 5.1
  • proof