Randomized algorithms for low-rank matrix approximation: Design, analysis, and applications

Joel A. Tropp, Robert J. Webber

TL;DR

This survey compares three randomized methods for low-rank matrix approximation, the randomized SVD (RSVD), randomized subspace iteration (RSI), and randomized block Krylov iteration (RBKI), together with their Nyström variants. It ties the choice of algorithm to the singular-value decay of the input matrix and the available computational resources. It provides new, explicit error bounds, improved RBKI pseudocode, and practical recommendations, including new NysBKI variants for positive-semidefinite (psd) matrices. RSVD and NysSVD perform well on rapidly decaying spectra, while RBKI and NysBKI are more accurate and efficient on slowly decaying spectra. Applications include principal component analysis for genetic data and kernel spectral clustering for molecular dynamics data. Together, the guidelines, error bounds, and pseudocode support broader adoption of Krylov-based randomized methods in computational science.

Abstract

This survey explores modern approaches for computing low-rank approximations of high-dimensional matrices by means of the randomized SVD, randomized subspace iteration, and randomized block Krylov iteration. The paper compares the procedures via theoretical analyses and numerical studies to highlight how the best choice of algorithm depends on spectral properties of the matrix and the computational resources available. Despite superior performance for many problems, randomized block Krylov iteration has not been widely adopted in computational science. The paper strengthens the case for this method in three ways. First, it presents new pseudocode that can significantly reduce computational costs. Second, it provides a new analysis that yields simple, precise, and informative error bounds. Last, it showcases applications to challenging scientific problems, including principal component analysis for genetic data and spectral clustering for molecular dynamics data.
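For readers unfamiliar with block Krylov iteration, the following NumPy sketch shows the basic idea under simplifying assumptions. It is a plain block Lanczos-style construction with full reorthogonalization, not the paper's new pseudocode, and it makes no attempt at the cost savings the abstract mentions. The function name `rbki`, the parameters `k` (block size) and `q` (number of Krylov steps), and the synthetic matrix with singular values $0.9^{j}$ are assumptions for illustration. The sketch also assumes $(q+1)k \leq \min(L, N)$ and that the Krylov space does not break down early.

```python
import numpy as np

def rbki(A, k, q, seed=None):
    """Illustrative block Krylov approximation.

    Builds an orthonormal basis Q for
        span[A Om, (A A^T) A Om, ..., (A A^T)^q A Om]
    and returns the SVD of the projected matrix Q Q^T A.
    """
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k))   # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)                 # first block
    block = Q
    for _ in range(q):
        W = A @ (A.T @ block)                      # apply A A^T to the newest block
        for _ in range(2):                         # two Gram-Schmidt passes for stability
            W = W - Q @ (Q.T @ W)
        block, _ = np.linalg.qr(W)
        Q = np.hstack([Q, block])                  # grow the basis by k columns
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub, s, Vt                           # rank up to (q + 1) * k

# Slowly decaying spectrum: sigma_j = 0.9^(j-1) for j = 1, ..., 60.
rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.standard_normal((80, 60)))
V0, _ = np.linalg.qr(rng.standard_normal((60, 60)))
sigma = 0.9 ** np.arange(60)
A = (U0 * sigma) @ V0.T
for q in (0, 1, 3):
    U, s, Vt = rbki(A, k=5, q=q, seed=1)
    err = np.linalg.norm(A - (U * s) @ Vt, 2)
    print(f"q={q}: basis size {U.shape[1]}, spectral error {err:.4f}")
```

Unlike subspace iteration, which keeps only the last block, this construction keeps every block. With the same test matrix, the basis for `q=3` therefore contains the basis for `q=0`, and the projection error cannot increase as `q` grows.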

Paper Structure (47 sections, 17 theorems, 178 equations, 14 figures, 10 algorithms)

This paper contains 47 sections, 17 theorems, 178 equations, 14 figures, 10 algorithms.

Key Result

Lemma 5.1

Consider a matrix $\bm{A} \in \mathbb{R}^{L \times N}$ and orthogonal projections $\bm{P}_1, \bm{P}_2 \in \mathbb{R}^{L \times L}$ such that $\textup{range}(\bm{P}_1) \subseteq \textup{range}(\bm{P}_2)$. Then $\lVert (\mathbf{I} - \bm{P}_2) \bm{A} \rVert_p \leq \lVert (\mathbf{I} - \bm{P}_1) \bm{A} \rVert_p$ for any Schatten $p$-norm with $1 \leq p \leq \infty$. The same inequality holds for any unitarily invariant norm.
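A quick numerical sanity check can illustrate the lemma. It assumes the inequality takes the standard form $\lVert (\mathbf{I} - \bm{P}_2) \bm{A} \rVert_p \leq \lVert (\mathbf{I} - \bm{P}_1) \bm{A} \rVert_p$: projecting onto a larger subspace never increases the residual. The helper `schatten_norm`, the dimensions, and the random test data are assumptions for illustration. A check on random inputs is not a proof.

```python
import numpy as np

def schatten_norm(M, p):
    """Schatten p-norm: the l_p norm of the singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)
    if np.isinf(p):
        return s.max()                      # spectral norm
    return (s ** p).sum() ** (1.0 / p)      # p = 1: nuclear, p = 2: Frobenius

rng = np.random.default_rng(0)
L, N = 30, 20
A = rng.standard_normal((L, N))

# Nested subspaces: the first 3 columns of Q2 span a subspace of range(Q2).
Q2, _ = np.linalg.qr(rng.standard_normal((L, 8)))
Q1 = Q2[:, :3]
P1 = Q1 @ Q1.T                              # orthogonal projection onto range(Q1)
P2 = Q2 @ Q2.T                              # orthogonal projection onto range(Q2)
I = np.eye(L)

for p in (1, 2, 4, np.inf):
    r1 = schatten_norm((I - P1) @ A, p)     # residual for the smaller projection
    r2 = schatten_norm((I - P2) @ A, p)     # residual for the bigger projection
    print(f"p={p}: bigger projection has the smaller residual: {r2 <= r1 + 1e-12}")
```

The property follows from the identity $(\mathbf{I} - \bm{P}_2) = (\mathbf{I} - \bm{P}_2)(\mathbf{I} - \bm{P}_1)$, which holds whenever the ranges are nested, together with the fact that multiplying by an orthogonal projection cannot increase a unitarily invariant norm.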

Figures (14)

  • Figure 1: (Singular value decay profiles). Fast versus slow singular value decay; for details of the matrices, see \ref{sec:illustrative}.
  • Figure 1: (Runtime comparisons). Eigenvector approximation error of NysBKI with block size $k = 1$, $2$, $5$, $10$, or $100$ applied to the $250{,}000 \times 250{,}000$ kernel matrix, as constructed in \ref{sec:clustering}. The left panel measures the number of matrix--vector products while the right panel measures the runtime.
  • Figure 1: (Fast vs. slow singular value decay). Fast singular value decay of the matrix $\bm{A}$ versus slow singular value decay of the matrix $\bm{B}$, as constructed in \ref{sec:comparison}.
  • Figure 1: (Gapless vs. gapped bounds). Gapless (left) versus gapped (right) error bounds on the log error ratio $\log \bigl(\operatorname{\mathbb{E}} \lVert \bm{B} - \hat{\bm{B}} \rVert^2 / \sigma_{r+1}(\bm{B})^2\bigr)$ when approximating the matrix $\bm{B}$ from \ref{sec:comparison} with block size $k = 100$ and target rank $r = 75$.
  • Figure 2: (Accuracy of the singular vectors). Singular vector approximation error for the clean matrix $\bm{A}$ (blue) and the noisy matrix $\bm{B}$ (orange), as constructed in \ref{sec:illustrative}.
  • ...and 9 more figures

Theorems & Definitions (35)

  • Remark 1.1: Finding structure with randomness
  • Remark 2.1: Matrix multiplication
  • Proof 1
  • Lemma 5.2: Nyström helps
  • Proof 2
  • Remark 5.3: Column Nyström approximation
  • Theorem 8.1: RSVD error
  • Corollary 8.2: NysSVD error
  • Lemma 8.3: Diagonal, psd reduction
  • Proof 3
  • ...and 25 more