Splitting methods based on the nonzero diagonal pattern for computing matrix functions

Majed Hamadi; Nezam Mahdavi-Amiri; Marcel Schweitzer

TL;DR

The paper introduces a nonzero-diagonal pattern approach to compute matrix functions f(A) for sparse A by exploiting the sparsity pattern of p(A) from a low-degree polynomial approximation. It develops efficient machinery to identify which diagonals and submatrices influence each entry, enabling sparse representations and fast trace computations, and connects these ideas to probing methods. A specialized Toeplitz-focused algorithm shows how f(T) can be obtained from a small submatrix, yielding excellent scalability. Numerical experiments demonstrate competitive accuracy and performance for trace estimation, Toeplitz functions, and Estrada index calculations. The work offers a practical preprocessing alternative to graph coloring-based probing and provides a foundation for adaptive degree selection and error estimation in future research.

Abstract

We consider the task of approximating a matrix function $f(A)$, where $A$ is a matrix in which only a relatively small number of (not necessarily consecutive) sub- and superdiagonals contain nonzero entries. Approximating $f$ by a low-degree polynomial $p$ allows us to obtain sparse approximations to $f(A)$, which one can efficiently work with (while, in general, $f(A)$ is a dense matrix, even when $A$ is sparse). Our approach is based on carefully inspecting the locations where nonzeros can occur in $p(A)$, and identifying the entries in $A$ that influence them. In particular, we illustrate how this approach can be used for efficiently approximating the trace of $f(A)$ and identify how this approach is related to established (stochastic) probing methods for trace estimation. Another application area in which our approach works particularly well is the computation of functions of Toeplitz matrices. Here, studying the sparsity pattern of $p(A)$ allows us to reduce the computation of the whole matrix polynomial to that of a single small-scale submatrix, yielding an algorithm that scales exceptionally well to large problem sizes.
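The locality idea behind the trace approximation can be illustrated in the simplest setting, a banded matrix. The following is a minimal sketch, not the paper's algorithm: it assumes that $A$ has bandwidth `bandwidth` and that $p$ has degree $k$, in which case $[p(A)]_{ii}$ only involves closed walks of length at most $k$ starting at $i$, and these never leave the index window $|j - i| \le \lfloor k/2 \rfloor \cdot \texttt{bandwidth}$. The function names (`poly_of_matrix`, `local_diagonal_entry`) and the truncated Taylor polynomial of $\exp$ are illustrative assumptions; the paper treats general nonzero-diagonal patterns, for which the relevant submatrix is identified differently.

```python
import math
import numpy as np


def poly_of_matrix(coeffs, M):
    """Evaluate p(M) = sum_m coeffs[m] * M^m using Horner's scheme."""
    identity = np.eye(M.shape[0])
    P = np.zeros_like(M)
    for c in reversed(coeffs):
        P = P @ M + c * identity
    return P


def local_diagonal_entry(A, coeffs, i, bandwidth):
    """Compute [p(A)]_{ii} from a small principal submatrix of a banded A.

    Assumption (hypothetical helper, banded case only): A[r, c] = 0 whenever
    |r - c| > bandwidth. A closed walk of length m <= k from i stays within
    distance floor(m / 2) * bandwidth of i, so the window below suffices.
    """
    n = A.shape[0]
    k = len(coeffs) - 1
    radius = (k // 2) * bandwidth
    lo = max(0, i - radius)
    hi = min(n, i + radius + 1)
    sub = A[lo:hi, lo:hi]
    return poly_of_matrix(coeffs, sub)[i - lo, i - lo], sub.shape[0]


# Illustrative data: a random matrix with bandwidth 2.
rng = np.random.default_rng(0)
n, bandwidth, degree, i = 200, 2, 9, 100
A = 0.3 * rng.standard_normal((n, n))
A = np.triu(np.tril(A, bandwidth), -bandwidth)

# Degree-9 Taylor polynomial of exp, used as a stand-in for p.
coeffs = [1.0 / math.factorial(m) for m in range(degree + 1)]

local_value, window_size = local_diagonal_entry(A, coeffs, i, bandwidth)
full_value = poly_of_matrix(coeffs, A)[i, i]
print(window_size, abs(local_value - full_value))
```

In this sketch the window has $2 \lfloor k/2 \rfloor \cdot \texttt{bandwidth} + 1 = 17$ rows regardless of $n$, and the locally computed entry agrees with the one from the full matrix polynomial up to rounding.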

Paper Structure (19 sections, 10 theorems, 57 equations, 3 figures, 2 tables, 2 algorithms)


Introduction
Outline
Notation
Basic material -- polynomials of sparse and banded matrices
Nonzero-diagonal patterns of matrix powers
Efficiently computing $\mathcal{S}_k(A)$
Identifying relevant principal submatrices of $A$
Relation to probing methods
Computing a node partitioning based on our methodology
Delta trace estimator
Stochastic delta estimator
An efficient algorithm for computing functions of sparse Toeplitz matrices
Numerical experiments
Scaling of trace approximation methods with $k$
Comparison of algorithms for functions of Toeplitz matrices
...and 4 more sections

Key Result

Lemma 3.2

For $A\in \mathbb{C}^{n\times n}$, we have $\mathrm{ND}(A^k) \subseteq \mathcal{S}_k(A)$ for all $k\geq 0$.
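The inclusion in Lemma 3.2 can be checked numerically on a small example. The sketch below rests on an assumption, since neither set is defined in this summary: $\mathrm{ND}(M)$ is taken to be the set of offsets $j - i$ of diagonals of $M$ that contain a nonzero entry, and $\mathcal{S}_k(A)$ is taken to be the set of all sums of $k$ elements of $\mathrm{ND}(A)$ that are valid diagonal offsets of an $n \times n$ matrix. The helper names are hypothetical, and the diagonal pattern mimics the one in Figure 1 at a smaller size.

```python
import numpy as np


def nonzero_diagonals(M):
    """Offsets d = j - i of all diagonals of M that contain a nonzero entry."""
    rows, cols = np.nonzero(M)
    return set((cols - rows).tolist())


def sumset_power(offsets, k, n):
    """All sums of k elements of `offsets` that are valid offsets for n x n.

    Assumed stand-in for S_k(A); for k = 0 this is {0}, matching A^0 = I.
    """
    sums = {0}
    for _ in range(k):
        sums = {s + d for s in sums for d in offsets}
    return {s for s in sums if abs(s) <= n - 1}


# Illustrative matrix with ND(A) = {0, 1, 2, 3} U {50} U {100}.
rng = np.random.default_rng(0)
n = 120
offsets = {0, 1, 2, 3, 50, 100}
A = np.zeros((n, n))
for d in offsets:
    A += np.diag(rng.uniform(0.5, 1.5, size=n - abs(d)), k=d)

for k in range(5):
    nd_power = nonzero_diagonals(np.linalg.matrix_power(A, k))
    candidates = sumset_power(offsets, k, n)
    print(k, len(nd_power), len(candidates), nd_power <= candidates)
```

The candidate set depends only on the offsets, not on the matrix entries, so it can be computed without forming any power of $A$. The inclusion can be strict in general, for example when entries cancel in the product.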

Figures (3)

Figure 1: Illustration of Example \ref{example:diagonal_spread}. The matrix $A \in \mathbb{R}^{1000 \times 1000}$ contains diagonals corresponding to $\mathrm{ND}(A) = \{0,1,2,3\} \cup \{50\} \cup \{100\}$. We depict the sparsity pattern of (a) $A$, (b) $A^4$ and (c) $A^8$.
Figure 2: Illustration of Example \ref{example:diagonals}. The matrix $A$ contains diagonals corresponding to $\mathrm{ND}(A) = \{-154,\dots,-146\} \cup \{-3,\dots,3\} \cup \{148,\dots,152\} \cup \mathcal{N}$, where $\mathcal{N} = \{388,\dots,392\}$ in (a) and $\mathcal{N} = \{1228,\dots,1232\}$ in (b) and (c). The matrix is of size $n = 3000$ in (a) and (b) and of size $n = 8000$ in (c). The diagonals from $\mathcal{N}$ are depicted in green, the other diagonals are depicted in red. The blue dots indicate the submatrix that is used for approximating $[\exp(A)]_{1500,1500}$ using a degree-$9$ polynomial approximation.
Figure 3: Comparison of relative errors of stochastic and deterministic methods for approximating the trace of $f(A)$, for $A$ with $k = 1,\dots,10$, and with $100$ repetitions for each $k$ for the stochastic methods.

Theorems & Definitions (28)

Definition 2.1
Example 3.1
Lemma 3.2
proof
Remark 3.3
Lemma 4.1
proof
Theorem 4.2
proof
Example 4.3
...and 18 more
