Splitting methods based on the nonzero diagonal pattern for computing matrix functions
Majed Hamadi, Nezam Mahdavi-Amiri, Marcel Schweitzer
TL;DR
The paper introduces an approach for approximating matrix functions f(A) when A has only a few nonzero (not necessarily consecutive) sub- and superdiagonals, exploiting the sparsity pattern of p(A) for a low-degree polynomial approximation p of f. It identifies which diagonals of p(A) can contain nonzeros and which entries of A influence them, enabling sparse approximations of f(A) and efficient trace computations, and it relates these ideas to probing methods. For Toeplitz matrices, the computation of the whole matrix polynomial reduces to that of a single small submatrix, so the algorithm scales well to large problem sizes. Numerical experiments cover trace estimation, functions of Toeplitz matrices, and Estrada index calculations, and report competitive accuracy and performance. The work offers a practical preprocessing alternative to graph coloring-based probing and provides a foundation for adaptive degree selection and error estimation in future research.
Abstract
We consider the task of approximating a matrix function $f(A)$, where $A$ is a matrix in which only a relatively small number of (not necessarily consecutive) sub- and superdiagonals contain nonzero entries. Approximating $f$ by a low-degree polynomial $p$ allows us to obtain sparse approximations to $f(A)$, which one can efficiently work with (while, in general, $f(A)$ is a dense matrix, even when $A$ is sparse). Our approach is based on carefully inspecting the locations where nonzeros can occur in $p(A)$, and identifying the entries in $A$ that influence them. In particular, we illustrate how this approach can be used for efficiently approximating the trace of $f(A)$ and identify how this approach is related to established (stochastic) probing methods for trace estimation. Another application area in which our approach works particularly well is the computation of functions of Toeplitz matrices. Here, studying the sparsity pattern of $p(A)$ allows us to reduce the computation of the whole matrix polynomial to that of a single small-scale submatrix, yielding an algorithm that scales exceptionally well to large problem sizes.
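To make the sparsity argument concrete, the following minimal sketch (not the authors' implementation) predicts which diagonals of $p(A)$ can contain nonzeros. It relies only on the standard fact that an entry of $A^k$ on diagonal $j - i$ is a sum over index paths of length $k$, so its offset is a sum of $k$ offsets of nonzero diagonals of $A$. The function names `polynomial_diagonal_offsets`, `observed_diagonal_offsets`, and `matrix_from_diagonals` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def polynomial_diagonal_offsets(offsets, degree, n):
    """Offsets d = j - i of diagonals that may be nonzero in p(A), deg p <= degree.

    `offsets` holds the offsets of the nonzero diagonals of the n-by-n matrix A.
    The result is a superset of the true pattern (entries may still cancel).
    """
    reachable = {0}   # A^0 = I occupies only the main diagonal
    frontier = {0}    # offsets that A^k may occupy, starting with k = 0
    for _ in range(degree):
        # One more factor of A shifts every offset by an offset of A;
        # offsets with |d| >= n fall outside the matrix and are dropped.
        frontier = {d + e for d in frontier for e in offsets if abs(d + e) < n}
        reachable |= frontier
    return reachable

def observed_diagonal_offsets(A, degree):
    """Offsets occupied by I + |A| + ... + |A|^degree (no cancellation possible)."""
    abs_a = np.abs(A)
    power = np.eye(A.shape[0])
    total = power.copy()
    for _ in range(degree):
        power = power @ abs_a
        total += power
    rows, cols = np.nonzero(total)
    return {int(j - i) for i, j in zip(rows, cols)}

def matrix_from_diagonals(offsets, n, seed=0):
    """Dense test matrix with random positive entries on the given diagonals."""
    rng = np.random.default_rng(seed)
    A = np.zeros((n, n))
    for d in offsets:
        A += np.diag(rng.uniform(1.0, 2.0, size=n - abs(d)), k=d)
    return A
```

For a tridiagonal matrix, `polynomial_diagonal_offsets({-1, 0, 1}, 2, 10)` returns the offsets $-2, \dots, 2$, the familiar pentadiagonal pattern of a quadratic polynomial. For non-consecutive diagonals, comparing the prediction with `observed_diagonal_offsets(matrix_from_diagonals(offsets, n), degree)` gives a quick check that every occupied diagonal is among the predicted ones.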
