Algorithm-agnostic low-rank approximation of operator monotone matrix functions

David Persson; Raphael A. Meyer; Christopher Musco

TL;DR

Given a symmetric positive semidefinite matrix $\mathbf{A}$ and a continuous operator-monotone function $f$ with $f(0)=0$, the paper shows that a near-optimal Nyström approximation $\widehat{\mathbf{A}}$ yields a near-optimal rank-$k$ funNyström approximation $f(\widehat{\mathbf{A}})_{(k)}$ to $f(\mathbf{A})$ in the nuclear, Frobenius, and operator norms, along with guarantees for eigenvalue estimates. The main advance is a general, construction-agnostic theory: if $\mathbf{A} \succeq \widehat{\mathbf{A}} \succeq 0$ and $\widehat{\mathbf{A}}_{(k)}$ is a near-optimal rank-$k$ approximation to $\mathbf{A}$, then $f(\widehat{\mathbf{A}})_{(k)}$ is a near-optimal rank-$k$ approximation to $f(\mathbf{A})$, regardless of how $\widehat{\mathbf{A}}$ was computed. The earlier analysis covered only approximations built by subspace iteration. The results extend to many common ways of constructing Nyström approximations (Krylov methods, sampling, and column subset selection) by tying their guarantees to projection-based near-optimality, and the paper also discusses when its assumptions are necessary. Overall, the work supports inexpensive low-rank approximation of a broad class of matrix functions without computing $f(\mathbf{A})$ directly, which is relevant to high-dimensional computations and spectral estimation.

Abstract

Low-rank approximation of a matrix function, $f(A)$, is an important task in computational mathematics. Most methods require direct access to $f(A)$, which is often considerably more expensive than accessing $A$. Persson and Kressner (SIMAX 2023) avoid this issue for symmetric positive semidefinite matrices by proposing funNyström, which first constructs a Nyström approximation to $A$ using subspace iteration, and then uses the approximation to directly obtain a low-rank approximation for $f(A)$. They prove that the method yields a near-optimal approximation whenever $f$ is a continuous operator monotone function with $f(0) = 0$. We significantly generalize the results of Persson and Kressner beyond subspace iteration. We show that if $\widehat{A}$ is a near-optimal low-rank Nyström approximation to $A$ then $f(\widehat{A})$ is a near-optimal low-rank approximation to $f(A)$, independently of how $\widehat{A}$ is computed. Further, we show sufficient conditions for a basis $Q$ to produce a near-optimal Nyström approximation $\widehat{A} = AQ(Q^T AQ)^{\dagger} Q^T A$. We use these results to establish that many common low-rank approximation methods produce near-optimal Nyström approximations to $A$ and therefore to $f(A)$.
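The two-step recipe described in the abstract can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper names `nystrom_eig` and `fun_nystrom`, the test matrix, and the choice of $Q$ (one step of subspace iteration on a Gaussian sketch) are all assumptions made for the example.

```python
import numpy as np

def nystrom_eig(A, Q):
    """Eigenpairs of the Nystrom approximation A Q (Q^T A Q)^+ Q^T A (hypothetical helper)."""
    Y = A @ Q                                  # n x l sketch
    core = Q.T @ Y                             # l x l core matrix Q^T A Q
    core = (core + core.T) / 2                 # symmetrize against round-off
    w, V = np.linalg.eigh(core)
    keep = w > 1e-12 * w.max()                 # pseudo-inverse: drop negligible eigenvalues
    B = Y @ (V[:, keep] / np.sqrt(w[keep]))    # A_hat = B B^T
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    return U, s**2                             # A_hat = U diag(s^2) U^T, eigenvalues descending

def fun_nystrom(A, Q, f, k):
    """Rank-k truncation of f(A_hat); f is assumed increasing with f(0) = 0."""
    U, lam = nystrom_eig(A, Q)
    return U[:, :k], f(lam[:k])

# Assumed test problem: PSD matrix with eigenvalues 1/i^2.
rng = np.random.default_rng(0)
n, l, k = 60, 20, 5
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
evals = 1.0 / np.arange(1, n + 1) ** 2
A = (W * evals) @ W.T

# Assumed basis: one step of subspace iteration applied to a Gaussian sketch.
Q, _ = np.linalg.qr(A @ rng.standard_normal((n, l)))

U_k, f_lam = fun_nystrom(A, Q, np.sqrt, k)
approx = (U_k * f_lam) @ U_k.T                 # rank-k approximation to sqrt(A)
```

Note that $f(A)$ is never formed: $f$ is only applied to the $k$ leading eigenvalues of the Nyström approximation, which is what makes the approach cheap when products with $A$ are much less expensive than products with $f(A)$.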

Paper Structure (20 sections, 20 theorems, 67 equations, 3 figures, 1 table, 1 algorithm)

Introduction
The Nyström and funNyström approximations
Our Contributions
Notation
Good Nyström approximations imply good funNyström approximations
Nyström to funNyström: Frobenius and nuclear norm guarantees
Nyström to funNyström: Operator norm guarantees
Nyström to funNyström: Eigenvalue guarantees
Good projections imply good Nyström approximations
Projections to Nyström: Frobenius norm guarantees
Projections to Nyström: Nuclear norm guarantees
Projections to Nyström: Operator norm guarantees
Projections to Nyström: Eigenvalue guarantees
Numerical experiments
Column subset selection
...and 5 more sections

Key Result

Theorem 1.1

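Suppose that $\bm{A} \succeq \widehat{\bm{A}} \succeq \bm{0}$. Let $\bm{A}_{(k)}$ and $\widehat{\bm{A}}_{(k)}$ be optimal rank $k$ approximations to $\bm{A}$ and $\widehat{\bm{A}}$, respectively. Further suppose that for $\varepsilon \geq 0$,

$$\|\bm{A} - \widehat{\bm{A}}_{(k)}\|_* \leq (1+\varepsilon)\,\|\bm{A} - \bm{A}_{(k)}\|_*,$$

where $\|\cdot\|_*$ denotes the nuclear norm. Then for any continuous operator monotone function $f:[0,\infty) \to [0,\infty)$ we have

$$\|f(\bm{A}) - f(\widehat{\bm{A}})_{(k)}\|_* \leq (1+\varepsilon)\,\|f(\bm{A}) - f(\bm{A})_{(k)}\|_*.$$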
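As a sanity check of the kind of statement Theorem 1.1 makes, the sketch below measures the smallest $\varepsilon$ for which $\widehat{\bm{A}}_{(k)}$ is a $(1+\varepsilon)$-optimal nuclear-norm approximation to $\bm{A}$, and then compares the nuclear-norm error of $f(\widehat{\bm{A}})_{(k)}$ against $(1+\varepsilon)$ times the optimal rank-$k$ error for $f(\bm{A})$. Everything here is an assumption made for illustration: the test matrix, the sketch size, the choice $f(x)=\sqrt{x}$, and the helper names. Dense eigendecompositions are used only to keep the check short.

```python
import numpy as np

def psd_fun(M, f):
    """Apply f to a symmetric PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    w = np.clip(w, 0.0, None)                  # clip round-off negatives
    return (V * f(w)) @ V.T

def best_rank_k(M, k):
    """Optimal rank-k approximation of a symmetric PSD matrix."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    idx = np.argsort(w)[::-1][:k]              # indices of the k largest eigenvalues
    return (V[:, idx] * w[idx]) @ V[:, idx].T

def nuclear(M):
    return np.linalg.norm(M, ord="nuc")

rng = np.random.default_rng(1)
n, l, k = 50, 15, 5
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (W * (1.0 / np.arange(1, n + 1) ** 2)) @ W.T   # assumed spectrum 1/i^2

# Assumed basis: orthonormalized A @ Omega for a Gaussian Omega.
Q, _ = np.linalg.qr(A @ rng.standard_normal((n, l)))
A_hat = A @ Q @ np.linalg.pinv(Q.T @ A @ Q) @ Q.T @ A

# Smallest eps satisfying the nuclear-norm hypothesis for this A_hat.
eps = nuclear(A - best_rank_k(A_hat, k)) / nuclear(A - best_rank_k(A, k)) - 1.0

f = np.sqrt                                    # operator monotone with f(0) = 0
fA = psd_fun(A, f)
lhs = nuclear(fA - best_rank_k(psd_fun(A_hat, f), k))
rhs = (1.0 + eps) * nuclear(fA - best_rank_k(fA, k))
print(f"eps = {eps:.3e}, funNystrom error = {lhs:.3e}, bound = {rhs:.3e}")
```

A single random instance cannot confirm a theorem, but it is a quick way to see the hypothesis and the conclusion side by side with the same $\varepsilon$.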

Figures (3)

Figure 1: Comparing $\varepsilon_{\text{projection}}, \varepsilon_{\text{Nyström}},$ and $\varepsilon_{\text{funNyström}}$ for column subset selection. Note that $\varepsilon_{\text{projection}}$ is significantly worse than $\varepsilon_{\text{Nyström}}$ and $\varepsilon_{\text{funNyström}}$, since the orthogonal projection $\bm{Q}\bm{Q}^T$ zeros out all except $\ell$ rows of $\bm{A}$. In contrast, the Nyström approximation $\widehat{\bm{A}} = \bm{A}^{1/2} \bm{P}_{\bm{A}^{1/2} \bm{Q}} \bm{A}^{1/2}$ effectively performs half a step of subspace iteration on $\bm{Q}$, giving a better approximation.
Figure 2: Comparing $\varepsilon_{\text{projection}}, \varepsilon_{\text{Nyström}},$ and $\varepsilon_{\text{funNyström}}$ for Krylov iteration.
Figure 3: Comparing $\varepsilon_{\text{projection}}, \varepsilon_{\text{Nyström}},$ and $\varepsilon_{\text{funNyström}}$ for subspace iteration.
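The two facts used in the Figure 1 caption can be checked numerically. The sketch below assumes a small random PSD test matrix and a uniformly random column subset (both chosen only for illustration). It verifies that $\bm{A}\bm{Q}(\bm{Q}^T\bm{A}\bm{Q})^{\dagger}\bm{Q}^T\bm{A}$ coincides with $\bm{A}^{1/2}\bm{P}_{\bm{A}^{1/2}\bm{Q}}\bm{A}^{1/2}$, where $\bm{P}_{\bm{A}^{1/2}\bm{Q}}$ is the orthogonal projector onto the range of $\bm{A}^{1/2}\bm{Q}$, and that $\bm{Q}\bm{Q}^T\bm{A}$ keeps only $\ell$ nonzero rows when $\bm{Q}$ selects columns.

```python
import numpy as np

rng = np.random.default_rng(2)
n, l = 40, 8

# Assumed PSD test matrix with known square root.
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = 1.0 / np.arange(1, n + 1)
A = (W * lam) @ W.T
A_half = (W * np.sqrt(lam)) @ W.T

# Column subset selection: Q consists of l columns of the identity.
cols = rng.choice(n, size=l, replace=False)
Q = np.eye(n)[:, cols]

# Nystrom approximation in its usual form.
nystrom = A @ Q @ np.linalg.pinv(Q.T @ A @ Q) @ Q.T @ A

# Same matrix written as A^{1/2} P A^{1/2}, with P the projector onto range(A^{1/2} Q).
Z, _ = np.linalg.qr(A_half @ Q)
projected = A_half @ (Z @ Z.T) @ A_half

# The plain projection Q Q^T A keeps only the selected rows of A.
rows_kept = np.count_nonzero(np.abs(Q @ Q.T @ A).sum(axis=1) > 0)

print(np.allclose(nystrom, projected, atol=1e-10), rows_kept)
```

The projector form makes the caption's remark concrete: the Nyström approximation projects onto the range of $\bm{A}^{1/2}\bm{Q}$ rather than the range of $\bm{Q}$, so $\bm{A}^{1/2}$ has already been applied once to the basis.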

Theorems & Definitions (36)

Theorem 1.1
Theorem 1.2
Corollary 1.3
Corollary 1.4
Lemma 2.1
proof
Lemma 2.2
proof
Lemma 2.3
proof
...and 26 more
