Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations

Yijun Dong, Per-Gunnar Martinsson, Yuji Nakatsukasa

TL;DR

This work analyzes the accuracy of singular vectors from randomized SVD in terms of canonical angles between true and computed subspaces. It introduces space-agnostic probabilistic bounds that remain valid without knowledge of the true subspaces, and shows these bounds are asymptotically tight under moderate oversampling when $l=\Omega(k)$; it also provides unbiased estimators and posterior residual-based guarantees that depend on the computed residuals and spectra. Through extensive numerical experiments on synthetic and real matrices, the authors demonstrate when space-agnostic bounds or posterior bounds dominate and how oversampling and power iterations interact under a fixed computational budget. The results offer practical guidance for selecting oversampling and power-iteration parameters and contribute efficient tools for certifying subspace accuracy in large-scale randomized subspace approximations.

Abstract

Randomized subspace approximation with "matrix sketching" is an effective approach for constructing approximate partial singular value decompositions (SVDs) of large matrices. The performance of such techniques has been extensively analyzed, and very precise estimates on the distribution of the residual errors have been derived. However, our understanding of the accuracy of the computed singular vectors (measured in terms of the canonical angles between the spaces spanned by the exact and the computed singular vectors, respectively) remains relatively limited. In this work, we present practical bounds and estimates for canonical angles of randomized subspace approximation that can be computed efficiently either a priori or a posteriori, without assuming prior knowledge of the true singular subspaces. Under moderate oversampling in the randomized SVD, our prior probabilistic bounds are asymptotically tight and can be computed efficiently, while bringing a clear insight into the balance between oversampling and power iterations given a fixed budget on the number of matrix-vector multiplications. The numerical experiments demonstrate the empirical effectiveness of these canonical angle bounds and estimates on different matrices under various algorithmic choices for the randomized SVD.
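To make the quantities in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It builds a Gaussian-sketch range basis with optional power iterations, then measures the canonical angles between the exact leading left singular subspace and the sketched subspace. The function names (`randomized_range`, `canonical_angles`), the synthetic test matrix, and all sizes are illustrative assumptions.

```python
import numpy as np


def randomized_range(A, l, q=0, seed=None):
    """Orthonormal basis (m x l) for the range of a Gaussian sketch of A.

    Hypothetical helper: a plain range finder with q re-orthonormalized
    power iterations, assumed here for illustration.
    """
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], l))  # Gaussian embedding
    Q, _ = np.linalg.qr(A @ Omega)
    for _ in range(q):
        Z, _ = np.linalg.qr(A.T @ Q)  # orthonormalize between products
        Q, _ = np.linalg.qr(A @ Z)
    return Q


def canonical_angles(U, V):
    """Canonical angles (ascending, in radians) between range(U) and range(V).

    U (m x k) and V (m x l) must have orthonormal columns, with k <= l.
    The cosines of the angles are the singular values of U^T V.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)  # descending, length k
    return np.arccos(np.clip(cosines, 0.0, 1.0))


# Synthetic matrix with known singular vectors and polynomial spectral decay
# (an assumed test case, not one of the paper's matrices).
m, n, k, l = 200, 150, 10, 30
rng = np.random.default_rng(0)
U_true, _ = np.linalg.qr(rng.standard_normal((m, n)))
V_true, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 1.0 / np.arange(1, n + 1) ** 2
A = (U_true * sigma) @ V_true.T
U_k = U_true[:, :k]  # exact leading left singular subspace

# Sines of the canonical angles, without and with one power iteration.
sines_q0 = np.sin(canonical_angles(U_k, randomized_range(A, l, q=0, seed=1)))
sines_q1 = np.sin(canonical_angles(U_k, randomized_range(A, l, q=1, seed=1)))
print("largest sine, q=0:", sines_q0.max())
print("largest sine, q=1:", sines_q1.max())
```

In this sketch, a run with sketch size `l` and `q` power iterations costs `(2q + 1) * l` matrix-vector products with `A` or its transpose, which is the kind of fixed budget under which the abstract weighs oversampling against power iterations. The exact subspace `U_k` is available here only because the matrix is synthetic; the paper's bounds and estimates target the setting where it is not.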

Paper Structure

This paper contains 22 sections, 8 theorems, 71 equations, 18 figures, 1 table, 2 algorithms.

Key Result

Theorem 3.1

For a rank-$l$ randomized SVD (algo:rsvd_power_iterations) with a Gaussian embedding $\boldsymbol{\Omega}$ and $q \ge 0$ power iterations, when the oversampled rank $l$ satisfies $l = \Omega\left(k\right)$ (where $k$ is the target rank, $k < l < r = \mathop{\mat [...] for all $i \in [k]$, where $\epsilon_1 = \Theta\left(\sqrt{\frac{k}{l}}\right)$ and $\epsilon_2 = \T [...]

Figures (18)

  • Figure 1: Synthetic Gaussian with the slower spectral decay. $k=50$, $l=200$, $q=0,1$.
  • Figure 1: Synthetic Gaussian with the slower spectral decay. $k=50$, $l=200$, $q=0,1$.
  • Figure 2: Synthetic Gaussian with the slower spectral decay. $k=50$, $l=80$, $q=0,1$.
  • Figure 2: Synthetic Gaussian with the faster spectral decay. $k=50$, $l=200$, $q=0,1$.
  • Figure 3: Synthetic Gaussian with the faster spectral decay. $k=50$, $l=200$, $q=0,1$.
  • ...and 13 more figures

Theorems & Definitions (19)

  • Definition 2.1: Canonical angles, golub2013
  • Theorem 3.1
  • Proof 1: Proof of Theorem 3.1
  • Remark 3.2: Comparison with existing probabilistic bounds
  • Proposition 4.1
  • Proof 2: Proof of Proposition 4.1
  • Remark 5.1: Generality of residual-based bounds
  • Theorem 5.2
  • Remark 5.3: Left versus right singular subspaces
  • Proof 3: Proof of \ref{thm:with_oversmp_computable_det}
  • ...and 9 more