Does block size matter in randomized block Krylov low-rank approximation?

Tyler Chen, Ethan N. Epperly, Raphael A. Meyer, Christopher Musco, Akash Rao

TL;DR

This paper resolves a long-standing gap between theory and practice for randomized block Krylov iteration (RBKI) in low-rank approximation (LRA). It proves a unified, gap-independent bound showing that RBKI with any block size $1\le b\le k$ achieves a $(1+\varepsilon)$-approximation to the top-$k$ components using $\tilde{O}(k/\sqrt{\varepsilon})$ matrix-vector products, with high probability. The key technical contribution is a sharp bound on the minimum singular value of a random block Krylov matrix, built via a Vandermonde-form reduction, anti-concentration arguments (Nie 2022), and a decomposition in the style of Peng and Vempala, which may be of independent interest beyond LRA. The results justify using intermediate block sizes in practice, offer gap-dependent refinements, and connect to smoothed-analysis ideas to guarantee well-conditioned behavior in realistic settings, potentially accelerating large-scale matrix computations and related sparse linear-system solvers.

Abstract

We study the problem of computing a rank-$k$ approximation of a matrix using randomized block Krylov iteration. Prior work has shown that, for block size $b = 1$ or $b = k$, a $(1 + \varepsilon)$-factor approximation to the best rank-$k$ approximation can be obtained after $\tilde O(k/\sqrt{\varepsilon})$ matrix-vector products with the target matrix. On the other hand, when $b$ is between $1$ and $k$, the best known bound on the number of matrix-vector products scales with $b(k-b)$, which could be as large as $O(k^2)$. Nevertheless, in practice, the performance of block Krylov methods is often optimized by choosing a block size $1 \ll b \ll k$. We resolve this theory-practice gap by proving that randomized block Krylov iteration produces a $(1 + \varepsilon)$-factor approximate rank-$k$ approximation using $\tilde O(k/\sqrt{\varepsilon})$ matrix-vector products for any block size $1\le b\le k$. Our analysis relies on new bounds for the minimum singular value of a random block Krylov matrix, which may be of independent interest. Similar bounds are central to recent breakthroughs on faster algorithms for sparse linear systems [Peng & Vempala, SODA 2021; Nie, STOC 2022].
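To make the object of study concrete, here is a minimal NumPy sketch of randomized block Krylov iteration for rank-$k$ approximation. It is an illustration under stated assumptions, not the paper's algorithm verbatim: the function name `rbki_low_rank`, the Gaussian starting block, the use of $\mathbf{A}\mathbf{A}^\top$ as the iterated operator, and the full reorthogonalization of each block are choices made here for a stable, self-contained example.

```python
import numpy as np

def rbki_low_rank(A, k, b, q, seed=0):
    """Sketch: rank-k approximation from a block Krylov subspace.

    Assumed setup (not taken from the paper): Gaussian starting block of
    width b, q blocks in total, subspace spanned by
    [A G, (A A^T) A G, ..., (A A^T)^(q-1) A G]. Requires q * b <= A.shape[0].
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    basis = np.empty((m, 0))   # orthonormal basis of the Krylov subspace so far
    Qj = None
    for j in range(q):
        # Next block: A G for the first block, (A A^T) Q_{j-1} afterwards.
        Y = A @ rng.standard_normal((n, b)) if j == 0 else A @ (A.T @ Qj)
        # Orthogonalize against all earlier blocks (two passes for stability).
        for _ in range(2):
            Y = Y - basis @ (basis.T @ Y)
        Qj, _ = np.linalg.qr(Y)
        basis = np.hstack([basis, Qj])
    # Rayleigh-Ritz step: best rank-k approximation of A within span(basis).
    U, s, Vt = np.linalg.svd(basis.T @ A, full_matrices=False)
    return basis @ U[:, :k], s[:k], Vt[:k]
```

Calling `Uk, s, Vt = rbki_low_rank(A, k=5, b=2, q=10)` returns an orthonormal `Uk` whose projection `Uk @ (Uk.T @ A)` serves as the rank-$k$ approximation. Orthogonalizing each block against the earlier ones leaves the spanned subspace unchanged in exact arithmetic; it is used here only to avoid forming an ill-conditioned Krylov matrix explicitly.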

Paper Structure

This paper contains 24 sections, 14 theorems, 69 equations, 6 figures, 1 algorithm.

Key Result

Theorem 1.3

Fix a rank $k$ and block size $1\le b \le k$. The block Krylov algorithm (alg:block_krylov_LRA) solves the low-rank approximation problem (problem:LRA) with probability at least $0.99$ with total matrix-vector complexity $bq = \widetilde{\mathcal{O}}(k/\sqrt{\varepsilon})$.
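As a worked reading of this bound (a sketch, assuming $q$ denotes the number of Krylov blocks so that $bq$ counts matrix-vector products, with all logarithmic factors left inside $\widetilde{\mathcal{O}}$), dividing through by $b$ gives the depth needed at each block size:

```latex
bq = \widetilde{\mathcal{O}}\!\left(\frac{k}{\sqrt{\varepsilon}}\right)
\quad\Longrightarrow\quad
q = \widetilde{\mathcal{O}}\!\left(\frac{k}{b\sqrt{\varepsilon}}\right),
\qquad
b = 1:\ q = \widetilde{\mathcal{O}}\!\left(\frac{k}{\sqrt{\varepsilon}}\right),
\qquad
b = k:\ q = \widetilde{\mathcal{O}}\!\left(\frac{1}{\sqrt{\varepsilon}}\right).
```

Larger blocks therefore trade depth for width, while the total number of matrix-vector products stays at the same order for every $1\le b\le k$.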

Figures (6)

  • Figure 1: Accuracy versus cost to compute a rank $k=200$ approximation for a $2000\times 2000$ dense matrix $\mathbf{A}$ using the block Krylov algorithm (alg:block_krylov_LRA). See sec:more-figs for details. While smaller block sizes perform best in terms of matrix-vector products, the fastest choice in terms of wall-clock time is an intermediate block size, $b=20$. Prior to our work, theoretical guarantees for intermediate block sizes lagged behind those for $b=1$ or $b=k$.
  • Figure 2: Spectra for the eq:intro, eq:doubles, eq:fastdecay, and eq:slowdecay problems.
  • Figure 3: Convergence of the block Krylov algorithm (alg:block_krylov_LRA) on the eq:intro problem for several values of $k$.
  • Figure 4: Convergence of the block Krylov algorithm (alg:block_krylov_LRA) on the eq:doubles problem for several values of $k$.
  • Figure 5: Convergence of the block Krylov algorithm (alg:block_krylov_LRA) on the eq:fastdecay problem for several values of $k$.
  • ...and 1 more figure

Theorems & Definitions (26)

  • Theorem 1.3: RBKI with any block size, gap-independent bounds, informal version
  • Theorem 1.4: Random block Krylov matrices are not too ill-conditioned, informal version
  • Proposition 2.3: Gaussian anti-concentration
  • Definition 3.1: Good starting block
  • Lemma 3.4: From minimum singular value to $(k,L)$-good
  • Proof: Proof of lem:kl-good-to-min-sing-val-square
  • Lemma 3.5: Smaller rank is easier
  • proof
  • Remark 3.6: Additive and relative gaps
  • Theorem 3.8: RBKI with any block size, gap-dependent bounds
  • ...and 16 more