
Splitting methods based on the nonzero diagonal pattern for computing matrix functions

Majed Hamadi, Nezam Mahdavi-Amiri, Marcel Schweitzer

TL;DR

The paper introduces an approach for computing matrix functions $f(A)$ of sparse matrices $A$ that is based on the pattern of nonzero diagonals: $f$ is approximated by a low-degree polynomial $p$, and the sparsity pattern of $p(A)$ is exploited. It develops machinery to identify which diagonals and submatrices influence each entry of $p(A)$, which enables sparse representations and fast trace computations, and it connects these ideas to probing methods. A specialized algorithm for Toeplitz matrices $T$ shows how $f(T)$ can be obtained from a small submatrix, so that it scales well to large problem sizes. Numerical experiments on trace estimation, functions of Toeplitz matrices, and Estrada index calculations demonstrate competitive accuracy and performance. The approach offers a practical preprocessing alternative to probing based on graph coloring, and it provides a foundation for adaptive degree selection and error estimation in future research.

Abstract

We consider the task of approximating a matrix function $f(A)$, where $A$ is a matrix in which only a relatively small number of (not necessarily consecutive) sub- and superdiagonals contain nonzero entries. Approximating $f$ by a low-degree polynomial $p$ allows us to obtain sparse approximations to $f(A)$, which one can efficiently work with (while, in general, $f(A)$ is a dense matrix, even when $A$ is sparse). Our approach is based on carefully inspecting the locations where nonzeros can occur in $p(A)$, and identifying the entries in $A$ that influence them. In particular, we illustrate how this approach can be used for efficiently approximating the trace of $f(A)$ and identify how this approach is related to established (stochastic) probing methods for trace estimation. Another application area in which our approach works particularly well is the computation of functions of Toeplitz matrices. Here, studying the sparsity pattern of $p(A)$ allows us to reduce the computation of the whole matrix polynomial to that of a single small-scale submatrix, yielding an algorithm that scales exceptionally well to large problem sizes.
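The reduction for Toeplitz matrices can be illustrated with a small experiment. The sketch below is not the paper's algorithm; it assumes a banded Toeplitz matrix with bandwidth `beta`, uses the truncated Taylor series of the exponential as the polynomial $p$ of degree `m`, and all function names are hypothetical. It relies on the fact that walks of length at most `m` starting in a row at distance at least `m * beta` from both boundaries never leave the matrix, so these interior rows of $p(T)$ coincide with the central row of $p$ applied to a Toeplitz matrix of size `2 * m * beta + 1`.

```python
import math

def toeplitz_banded(n, coeffs):
    """n x n Toeplitz matrix with T[i][j] = coeffs[j - i] (zero for other offsets)."""
    return [[coeffs.get(j - i, 0.0) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[l] * B[l][j] for l in range(inner)) for j in range(cols)]
            for row in A]

def poly_horner(A, c):
    """Evaluate p(A) = c[0] I + c[1] A + ... + c[m] A^m with Horner's scheme."""
    n = len(A)
    P = [[c[-1] * (i == j) for j in range(n)] for i in range(n)]
    for ck in reversed(c[:-1]):
        P = matmul(P, A)
        for i in range(n):
            P[i][i] += ck
    return P

# Assumed test data: a tridiagonal Toeplitz matrix and a degree-6 Taylor polynomial.
coeffs = {-1: 0.5, 0: -1.0, 1: 0.25}
beta, m = 1, 6
c = [1.0 / math.factorial(k) for k in range(m + 1)]
reach = m * beta  # largest offset that can be nonzero in p(T)

# Central row of the small problem; entry reach + d belongs to diagonal offset d.
small_row = poly_horner(toeplitz_banded(2 * reach + 1, coeffs), c)[reach]

# Compare with all interior rows of a larger problem.
n = 40
P_large = poly_horner(toeplitz_banded(n, coeffs), c)
max_dev = max(
    abs(P_large[i][i + d] - small_row[reach + d])
    for i in range(reach, n - reach)
    for d in range(-reach, reach + 1)
)
print("largest deviation on interior rows:", max_dev)
```

Rows closer than `m * beta` to the first or last row are not covered by this check, since walks are cut off at the boundary there.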

Paper Structure

This paper contains 19 sections, 10 theorems, 57 equations, 3 figures, 2 tables, 2 algorithms.

Key Result

Lemma 3.2

For $A\in \mathbb{C}^{n\times n}$, we have $\mathrm{ND}(A^k) \subseteq \mathcal{S}_k(A)$ for all $k\geq 0$.
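Lemma 3.2 can be checked numerically on a small example. The sketch below is not taken from the paper. It assumes that $\mathrm{ND}(A)$ denotes the set of offsets $d = j - i$ of the diagonals of $A$ that contain a nonzero entry, and that $\mathcal{S}_k(A)$ is the set of all sums of $k$ elements of $\mathrm{ND}(A)$ that lie in $\{-(n-1),\dots,n-1\}$, with $\mathcal{S}_0(A) = \{0\}$. Neither definition is stated in this summary, and the helper names are hypothetical.

```python
def nonzero_diagonals(A):
    """Offsets d = j - i of all diagonals of A that hold a nonzero entry (assumed ND)."""
    n = len(A)
    return {j - i for i in range(n) for j in range(n) if A[i][j] != 0}

def matmul(A, B):
    """Plain dense matrix product for square lists of lists."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(A, k):
    """A^k by repeated multiplication, with A^0 = I."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = matmul(P, A)
    return P

def sum_set(offsets, k, n):
    """Assumed S_k: sums of k offsets that are valid diagonal indices of an n x n matrix."""
    sums = {0}
    for _ in range(k):
        sums = {s + d for s in sums for d in offsets}
    return {s for s in sums if abs(s) <= n - 1}

# Assumed test matrix: ones on the diagonals with offsets 0, 1 and 4.
n = 12
offsets = {0, 1, 4}
A = [[1 if (j - i) in offsets else 0 for j in range(n)] for i in range(n)]

nd_A = nonzero_diagonals(A)
checks = {
    k: nonzero_diagonals(matrix_power(A, k)) <= sum_set(nd_A, k, n)
    for k in range(5)
}
print(checks)
```

The inclusion can be strict in general, because products of entries with opposite signs may cancel; the nonnegative entries used here rule this out.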

Figures (3)

  • Figure 1: Illustration of Example \ref{example:diagonal_spread}. The matrix $A \in \mathbb{R}^{1000 \times 1000}$ contains diagonals corresponding to $\mathrm{ND}(A) = \{0,1,2,3\}\ \cup\ \{50\}\ \cup \ \{100\}$. We depict the sparsity pattern of (a) $A$, (b) $A^4$ and (c) $A^8$.
  • Figure 2: Illustration of Example \ref{example:diagonals}. The matrix $A$ contains diagonals corresponding to $\mathrm{ND}(A) = \{-154,\dots, -146\}\ \cup\ \{-3, \dots, 3\}\ \cup \ \{ 148, \dots, 152\}\ \cup \ \mathcal{N}$, where $\mathcal{N} = \{388,\dots,392\}$ in (a) and $\mathcal{N} = \{1228,\dots,1232\}$ in (b) and (c). The matrix is of size $n = 3000$ in (a) and (b) and of size $n = 8000$ in (c). The diagonals from $\mathcal{N}$ are depicted in green, the other diagonals are depicted in red. The blue dots indicate the submatrix that is used for approximating $[\exp(A)]_{1500,1500}$ using a degree-$9$ polynomial approximation.
  • Figure 3: Comparison of relative errors of stochastic and deterministic methods for approximating the trace of $f(A)$, for $A$ with $k = 1,\dots,10$, and with $100$ repetitions for each $k$ for the stochastic methods.
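The submatrix idea behind Figure 2 can be illustrated in a simplified setting. The sketch below assumes a banded matrix with bandwidth `beta` (consecutive diagonals only) and a truncated Taylor polynomial of the exponential. It uses a contiguous index window instead of the index sets the paper constructs for non-consecutive diagonals, and all function names are hypothetical. A closed walk of length at most $m$ that starts at index $i$ cannot move farther than `beta * (m // 2)` from $i$, so the diagonal entry $[p(A)]_{ii}$ is determined by the principal submatrix on that window.

```python
import math

def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[l] * B[l][j] for l in range(inner)) for j in range(cols)]
            for row in A]

def taylor_exp(A, degree):
    """p(A) = sum_{k=0}^{degree} A^k / k!, the truncated Taylor series of exp."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [row[:] for row in term]
    for k in range(1, degree + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

def banded(n, beta):
    """Assumed test matrix: deterministic nonzero entries for |i - j| <= beta."""
    return [[0.1 * ((7 * i + 3 * j) % 5 + 1) if abs(i - j) <= beta else 0.0
             for j in range(n)] for i in range(n)]

def local_diagonal_entry(A, i, beta, degree):
    """[p(A)]_{ii} computed from the principal submatrix on a window around i."""
    n = len(A)
    radius = beta * (degree // 2)
    lo, hi = max(0, i - radius), min(n, i + radius + 1)
    sub = [row[lo:hi] for row in A[lo:hi]]
    return taylor_exp(sub, degree)[i - lo][i - lo]

n, beta, degree = 40, 2, 9
A = banded(n, beta)
P_full = taylor_exp(A, degree)

# Compare the full computation with the local one at an interior and two boundary indices.
pairs = [(P_full[i][i], local_diagonal_entry(A, i, beta, degree)) for i in (0, 20, n - 1)]
for full_entry, local_entry in pairs:
    print(full_entry, local_entry)
```

With `beta = 2` and `degree = 9`, the window holds at most 17 indices regardless of `n`, which is why the cost per entry does not grow with the matrix size in this simplified setting.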

Theorems & Definitions (28)

  • Definition 2.1
  • Example 3.1
  • Lemma 3.2
  • proof
  • Remark 3.3
  • Lemma 4.1
  • proof
  • Theorem 4.2
  • proof
  • Example 4.3
  • ...and 18 more