Improved bounds for randomized Schatten norm estimation of numerically low-rank matrices

Ya-Chi Chu, Alice Cortinovis

TL;DR

This paper analyzes the variance of a Gaussian-sketch-based estimator for the Schatten-$2p$ norms of large matrices, focusing on the regime where the spectrum decays rapidly (numerically low-rank). It derives a tight first-order variance expansion $\mathrm{Var}(\hat{\theta}_{2p}) = \frac{2p^2}{k}\|A\|_{4p}^{4p} + \mathcal{O}(k^{-2})$ and provides an explicit second-order bound on the variance of the scaled estimator $\hat{\Theta}_{2p}$. The results are shown to yield substantially tighter variance bounds than the previous $\mathrm{Var}(\hat{\theta}_{2p})$ bound when singular values decay quickly, with numerical experiments confirming the improved accuracy for moderate sketch sizes $k$. The work also discusses practical guidance for selecting $k$, contrasts with Hutchinson-type estimators, and situates the method within the sketching and spectrum-moment estimation literature, highlighting its relevance to large-scale Schatten-norm calculations in randomized numerical linear algebra.
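As a rough reading of the first-order expansion above (an illustration, not a statement quoted from the paper): assuming $\hat{\theta}_{2p}$ is an unbiased estimator of $\|A\|_{2p}^{2p}$ and dropping the $\mathcal{O}(k^{-2})$ term, the relative standard deviation is approximately

$$
\frac{\sqrt{\mathrm{Var}(\hat{\theta}_{2p})}}{\|A\|_{2p}^{2p}} \;\approx\; p\sqrt{\frac{2}{k}}\,\frac{\|A\|_{4p}^{2p}}{\|A\|_{2p}^{2p}} \;\le\; p\sqrt{\frac{2}{k}},
$$

where the inequality uses $\|A\|_{4p} \le \|A\|_{2p}$, with equality for rank-one $A$. For example, with $p = 3$ and $k = 200$ the right-hand side is $3\sqrt{0.01} = 0.3$. This heuristic ignores the second-order term and is not necessarily the paper's own rule of thumb for choosing $k$.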

Abstract

In this work, we analyze the variance of a stochastic estimator for computing Schatten norms of matrices. The estimator extracts information from a single sketch of the matrix, that is, the product of the matrix with a few standard Gaussian random vectors. While this estimator has been proposed and used in the literature before, the existing variance bounds are often pessimistic. Our work provides a new upper bound and estimates of the variance of this estimator. These theoretical findings are supported by numerical experiments, demonstrating that the new bounds are significantly tighter than the existing ones in the case of numerically low-rank matrices.

Paper Structure (16 sections, 13 theorems, 55 equations, 7 figures, 1 algorithm)

Key Result

Lemma 1

For any $p$-cycle $\tau = (i_1, \ldots, i_{p})$ with $i_\ell \in [k]$, a real matrix $A \in \mathbb{R}^{m \times n}$, and an $n \times k$ random matrix $\Omega = (\omega_{j,\ell})$ where $\omega_{j, \ell}$ are i.i.d. entries with mean $0$ and variance $1$, where the expectation is over the randomness of $\Omega$.
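A minimal numerical sketch of the kind of identity this lemma concerns. It assumes the conclusion takes the standard form from the cited works (li2014sketching, kong2017spectrum): for distinct indices $i_1, \ldots, i_p$ and $G = (A\Omega)^T (A\Omega)$, the expected cycle product $\mathbb{E}\big[\prod_{\ell} G_{i_\ell, i_{\ell+1}}\big]$ (with $i_{p+1} = i_1$) equals $\|A\|_{2p}^{2p}$. That form is an assumption here, and the function names, test matrix, and parameters below are hypothetical choices for illustration only.

```python
import numpy as np

def schatten_power(A, p):
    """Exact ||A||_{2p}^{2p}: sum of singular values raised to the power 2p."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** (2 * p)))

def cycle_product(G, cycle):
    """Product of G[i_l, i_{l+1}] around the cycle, wrapping back to the start."""
    p = len(cycle)
    return float(np.prod([G[cycle[l], cycle[(l + 1) % p]] for l in range(p)]))

def mean_cycle_product(A, cycle, k, trials, seed=0):
    """Monte Carlo average of the cycle product over independent Gaussian sketches."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    Omega = rng.standard_normal((trials, n, k))  # one n-by-k Gaussian matrix per trial
    Y = A @ Omega                                # sketches, shape (trials, m, k)
    G = np.transpose(Y, (0, 2, 1)) @ Y           # Gram matrices, shape (trials, k, k)
    p = len(cycle)
    prods = np.ones(trials)
    for l in range(p):
        prods *= G[:, cycle[l], cycle[(l + 1) % p]]
    return float(prods.mean())

# Hypothetical test matrix with geometrically decaying singular values.
A = np.diag(0.5 ** np.arange(6))
exact = schatten_power(A, 3)
approx = mean_cycle_product(A, cycle=(0, 1, 2), k=3, trials=100_000)
print(exact, approx)
```

The two printed numbers should agree up to Monte Carlo error. A single cycle product is a noisy quantity, which is why estimators of this type average over many cycles drawn from one sketch.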

Figures (7)

  • Figure 1: Comparison between the true variance of $\hat{\theta}_{6}$, the variance bound \ref{eq:varKV-original}, and our bound from Theorem \ref{thm:secondorderbound} for a $10 \times 10$ matrix $A$. Note that the $y$-axis has been "cut" for improved readability.
  • Figure 2: Circle notation.
  • Figure 3: Comparison of the variance of $\hat{\theta}_{2p}$, our bounds, our estimates, and \ref{eq:varKV-original} for the matrix from Example \ref{ex:0.8}.
  • Figure 4: Comparison of the normalized variance of $\hat{\theta}_{2p}$, our bounds, our estimates, and \ref{eq:varKV-original} for the matrix from Example \ref{ex:1overi} with diagonal entries $1, 1/4, 1/9, \ldots, 1/10000$.
  • Figure 5: Comparison of the variance of $\hat{\theta}_{2p}$, our bounds, our estimates, and \ref{eq:varKV-original} for the matrix from Example \ref{ex:1overi} with diagonal entries $1, 1/2^4, 1/3^4, \ldots, 1/100^4$.
  • ...and 2 more figures

Theorems & Definitions (34)

  • Lemma 1: li2014sketching and kong2017spectrum
  • proof
  • Theorem 2
  • Theorem 3
  • Corollary 4
  • Remark 5: A comparison with \ref{eq:varKV-original}
  • Remark 6: The constant in the $\mathcal{O}(\cdot)$
  • Remark 7: Rule of thumb for selecting $k$
  • Remark 8: A comparison with Hutchinson trace estimation
  • Lemma 9
  • ...and 24 more