
Improving Sketching Algorithms for Low-Rank Matrix Approximation via Sketch-Power Iterations

Chao Chang, Yuning Yang

Abstract

Power iteration can improve the accuracy of randomized SVD, but requires multiple data passes, making it impractical in streaming or memory-constrained settings. We introduce a lightweight yet effective sketch-power iteration, allowing power-like iterations with only a single pass of the data, which can be incorporated into one-pass algorithms for low-rank approximation. As an example, we integrate the sketch-power iteration into a one-pass algorithm proposed by Tropp et al., and introduce strategies to reduce its storage cost. We establish meaningful error bounds: given a fixed storage budget, the sketch sizes derived from the bounds closely match the optimal ones observed in reality. This allows one to preselect reasonable parameters. Numerical experiments on both synthetic and real-world datasets indicate that, under the same storage constraints, applying one or two sketch-power iterations can substantially improve the approximation accuracy of the considered one-pass algorithms. In particular, experiments on real data with flat spectrum show that the method can approximate the dominant singular vectors well.
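To make the trade-off concrete: below is a minimal NumPy sketch of the classical randomized SVD with power iteration (in the Halko–Martinsson–Tropp style) that the abstract contrasts against. This is an illustrative baseline, not the paper's sketch-power iteration; the function name, defaults, and oversampling choice are assumptions. Note how each power iteration `q` costs two additional full passes over `A`, which is exactly what is infeasible in a streaming setting.

```python
import numpy as np

def randomized_svd(A, rank, q=0, oversample=10, rng=None):
    """Baseline randomized SVD (illustrative sketch, not the paper's method).

    Each power iteration (q > 0) sharpens the range estimate of A but
    requires two extra passes over the data, so this variant is not
    one-pass. Re-orthonormalization between iterations would improve
    numerical stability; it is omitted here for brevity.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    s = rank + oversample                 # sketch size with oversampling
    Omega = rng.standard_normal((n, s))   # random test matrix
    Y = A @ Omega                         # pass 1 over A
    for _ in range(q):                    # each iteration: 2 more passes
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                # orthonormal range basis
    B = Q.T @ A                           # one more pass over A
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], S[:rank], Vt[:rank]
```

With `q = 0` this uses two passes over `A`; the one-pass algorithms the paper builds on instead form all sketches from a single pass and never revisit `A`.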


Paper Structure

This paper contains 70 sections, 25 theorems, 137 equations, 18 figures, 6 tables, 6 algorithms.

Key Result

Theorem 3.1

Let $\mathbf{Q}\in \mathbb{R}^{m\times s}$, $\mathbf{B}\in \mathbb{R}^{s\times n}$ be generated by Algorithm alg:psa-sps with $q=1$, and let $\boldsymbol{\Phi}\in \mathbb{R}^{n\times l}$, $\boldsymbol{\Omega}\in \mathbb{R}^{n\times s}$, $\boldsymbol{\Psi} \ldots$, where $\epsilon = \frac{2\varrho}{l-\varrho-1}$ and $E_F = \left\{ \boldsymbol{\Omega}, \ldots \right\}$.

Figures (18)

  • Figure 1: Figures of synthetic data. The $x$-axis shows the storage budget $\hat{T} n$ (parameterized by $\hat{T}$); the $y$-axis shows the relative Frobenius error $S_F$. All dashed lines represent oracle errors, while the markers indicate errors calculated with parameter guidance. All results are averaged over 20 independent repetitions.
  • Figure 1: Comparison of generating $\mathbf{ Q}$ by SPI (left) and by the ordinary method (right) in mixed precision. Subscripts "$l$" and "$h$" denote lower and higher precision. SPI exploits $\mathbf{ Z}$'s storage to convert from low to high precision. The right part shows that directly obtaining a higher-precision $\mathbf{ Q}$ from a lower-precision $\mathbf{ Y}$ is impossible due to the storage budget constraint.
  • Figure 2: Figures of synthetic data. The $y$-axis shows the relative spectral error $S_{\infty}$.
  • Figure 3: Distortion ratio $\sigma_i(\mathbf{ A} \boldsymbol{\Phi} )/\sigma_i(\mathbf{ A})$ for $r=10$ and $s=20$. The $x$-axis is the singular value index.
  • Figure 4: Error comparison of different $q$. The scatter points represent the results of individual experiments and the shaded areas represent the regions between the minimum and maximum errors. The dashed lines represent the average result across all $50$ experiments.
  • ...and 13 more figures

Theorems & Definitions (54)

  • Remark 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Example 4.1
  • Example 4.2
  • Example 4.3
  • Example 4.4
  • Example 4.5
  • Example 4.6
  • Lemma 5.1: Parallel sums [tropp2023RandomizedAlgorithms]
  • ...and 44 more