Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations
Yijun Dong, Per-Gunnar Martinsson, Yuji Nakatsukasa
TL;DR
This work analyzes the accuracy of singular vectors computed by the randomized SVD, measured by the canonical angles between the true and computed subspaces. It introduces space-agnostic probabilistic bounds that remain valid without knowledge of the true subspaces, and shows that these bounds are asymptotically tight under moderate oversampling, that is, when the sample size $l$ and the target rank $k$ satisfy $l=\Omega(k)$. It also provides unbiased estimators and posterior guarantees that depend on the computed residuals and spectra. Through extensive numerical experiments on synthetic and real matrices, the authors demonstrate when space-agnostic bounds or posterior bounds dominate and how oversampling and power iterations interact under a fixed computational budget. The results offer practical guidance for selecting oversampling and power-iteration parameters and contribute efficient tools for certifying subspace accuracy in large-scale randomized subspace approximations.
Abstract
Randomized subspace approximation with "matrix sketching" is an effective approach for constructing approximate partial singular value decompositions (SVDs) of large matrices. The performance of such techniques has been extensively analyzed, and very precise estimates of the distribution of the residual errors have been derived. However, our understanding of the accuracy of the computed singular vectors (measured in terms of the canonical angles between the spaces spanned by the exact and the computed singular vectors, respectively) remains relatively limited. In this work, we present practical bounds and estimates for canonical angles of randomized subspace approximation that can be computed efficiently either a priori or a posteriori, without assuming prior knowledge of the true singular subspaces. Under moderate oversampling in the randomized SVD, our prior probabilistic bounds are asymptotically tight and can be computed efficiently, and they give clear insight into the balance between oversampling and power iterations given a fixed budget on the number of matrix-vector multiplications. The numerical experiments demonstrate the empirical effectiveness of these canonical angle bounds and estimates on different matrices under various algorithmic choices for the randomized SVD.
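
To make the measured quantity concrete, the following is a minimal NumPy sketch, not the authors' code, of how the canonical angles between an exact leading singular subspace and a randomized approximation might be computed. The function names `randomized_range` and `canonical_angles`, the Gaussian test matrix, the synthetic spectrum, and the cost model of $(2q+1)\,l$ matrix-vector multiplications for $q$ power iterations with sample size $l$ are all illustrative assumptions rather than details taken from the paper. The sketch computes the angles directly from the true subspace, which is exactly the knowledge the bounds in this work are designed to do without.

```python
import numpy as np


def randomized_range(A, l, q=0, rng=None):
    """Orthonormal basis for range((A A^T)^q A Omega), Omega Gaussian with l columns.

    Hypothetical helper: a plain randomized range finder with
    re-orthonormalization after every multiplication by A or A^T.
    """
    rng = np.random.default_rng(rng)
    Omega = rng.standard_normal((A.shape[1], l))
    Q, _ = np.linalg.qr(A @ Omega)
    for _ in range(q):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    return Q


def canonical_angles(U, V):
    """Canonical angles (radians, ascending) between range(U) and range(V).

    U and V must have orthonormal columns; the singular values of U^T V
    are the cosines of the angles.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, k = 200, 150, 5

    # Synthetic matrix with a known, geometrically decaying spectrum (assumption).
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    sigma = 0.8 ** np.arange(n)
    A = (U * sigma) @ V.T
    U_k = U[:, :k]  # exact leading left singular subspace

    # Two ways to spend 30 matrix-vector products under the (2q + 1) * l model.
    for l, q in [(30, 0), (10, 1)]:
        Q = randomized_range(A, l, q, rng=1)
        angles = canonical_angles(U_k, Q)
        print(f"l={l:2d}, q={q}: largest angle = {angles.max():.3e} rad")
```

Comparing the two settings in the loop is a small-scale version of the trade-off described above: a larger sample size with no power iterations against a smaller sample size with one power iteration, at equal matrix-vector cost under the assumed model.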
