Connecting randomized iterative methods with Krylov subspaces

Yonghan Sun, Deren Han, Jiaxin Xie

TL;DR

This work builds a bridge between randomized iterative methods and Krylov subspace techniques by introducing an affine-subspace search framework that adaptively combines previous iterates with a sketched normal vector. The key idea is to use a truncation parameter $\ell$ to form an affine subspace $\Pi_k$ and project toward $A^{\dagger}b$, enabling both randomized-Kaczmarz-like (RK-like) behavior ($\ell=1$) and Krylov-like behavior ($\ell=\infty$). The authors establish linear convergence in expectation, provide an efficient $O(n(k-j_k+1))$ per-iteration implementation, and introduce the iterative-sketching-based Krylov (IS-Krylov) method, which unifies various sketches and shows competitive numerical performance. Numerical experiments demonstrate that IS-Krylov-PS consistently outperforms several baselines in CPU time while maintaining similar iteration counts, highlighting scalability to large-scale problems. The work opens avenues for further exploration of optimal $\ell$ selection, connections to other randomized methods, and extensions to broader linear-algebra tasks.

Abstract

Randomized iterative methods, such as the randomized Kaczmarz method, have gained significant attention for solving large-scale linear systems due to their simplicity and efficiency. Meanwhile, Krylov subspace methods have emerged as a powerful class of algorithms, known for their robust theoretical foundations and rapid convergence properties. Despite the individual successes of these two paradigms, their underlying connection has remained largely unexplored. In this paper, we develop a unified framework that bridges randomized iterative methods and Krylov subspace techniques, supported by both rigorous theoretical analysis and practical implementation. The core idea is to formulate each iteration as an adaptively weighted linear combination of the sketched normal vector and previous iterates, with the weights optimally determined via a projection-based mechanism. This formulation not only reveals how subspace techniques can enhance the efficiency of randomized iterative methods, but also enables the design of a new class of iterative-sketching-based Krylov subspace algorithms. We prove that our method converges linearly in expectation and validate our findings with numerical experiments.
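
The projection idea in the abstract can be illustrated in its simplest form. What follows is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes a consistent system $Ax=b$, keeps no previous iterates (read here, as an assumption, as the $\ell=1$ case), and uses a row-selection sketch $S$, so the only search direction is the sketched normal vector $A^\top SS^\top(Ax_k-b)$. The names `sketched_step` and `solve` are hypothetical.

```python
import random


def sketched_step(A, b, x, rows):
    """Project toward the solution along d = A^T S S^T (A x - b).

    S is taken to be a row-selection sketch (an illustrative choice):
    S S^T keeps only the residual entries indexed by `rows`.
    """
    n = len(x)
    # Sketched residual: entries of A x - b on the sampled rows.
    res = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in rows]
    # Sketched normal vector d = sum over sampled rows of res_i * a_i.
    d = [sum(r * A[i][j] for r, i in zip(res, rows)) for j in range(n)]
    d_norm_sq = sum(v * v for v in d)
    if d_norm_sq == 0.0:
        return list(x)  # d = 0: no step is possible on this sample
    # For a consistent system, <d, x - x*> = ||S^T (A x - b)||^2, so this
    # step size gives the point on the line x - t d that is closest to x*.
    t = sum(r * r for r in res) / d_norm_sq
    return [x[j] - t * d[j] for j in range(n)]


def solve(A, b, iters=2000, block=3, seed=0):
    """Repeat the sketched step with uniformly sampled row blocks."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        rows = rng.sample(range(m), block)
        x = sketched_step(A, b, x, rows)
    return x
```

With a single sampled row, the step reduces to the classical Kaczmarz update $x_{k+1}=x_k-\frac{a_i^\top x_k-b_i}{\|a_i\|_2^2}a_i$. Enlarging the search space with previous iterates, as the paper proposes, is not attempted here.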

Paper Structure

This paper contains 31 sections, 12 theorems, 115 equations, 6 figures, 1 table, 3 algorithms.

Key Result

Lemma 2.1

Let $S\in\mathbb{R}^{m\times q}$ be a real-valued random variable defined on a probability space $(\Omega,\mathcal{F},\mathbf{P})$. Suppose that $\mathbb{E}\left[SS^\top\right]$ is a positive definite matrix and $A\in\mathbb{R}^{m\times n}$ with $A\neq 0$. Then $\mathbb{E}\left[\frac{SS^\top}{\|S^\top A\|_2^2}\right]$ is well-defined and positive definite, where we define $\frac{0}{0}=0$.

Figures (6)

  • Figure 1: The figures illustrate the evolution of RSE with respect to the number of iterations (top) and the CPU time (bottom). The title of each subplot indicates the corresponding values of $\kappa$ and $r$. The other parameters are fixed as $m = 5000$, $n = 1000$, $q = 30$, and $\ell = 10$. All computations are terminated once RSE$<{10}^{-12}$.
  • Figure 2: Figures depict the evolution of RSE with respect to the number of iterations (top) and the CPU time (bottom). The title of each plot indicates the values of $\kappa$ and $r$. We set $m=256,n=128,q=30$, and $\ell=10$. All computations are terminated once the number of iterations exceeds a certain limit.
  • Figure 3: Figures depict the evolution of the number of full iterations (top) and the CPU time (bottom) with respect to the block size $q$ and the number of previous iterations $\ell$. The title of each plot indicates the values of $\kappa$ and $r$. We set $m=1024$ and $n=128$. All computations are terminated once RSE$<{10}^{-12}$.
  • Figure 4: Figures depict the evolution of RSE with respect to the number of iterations (top) and the CPU time (bottom). The title of each plot indicates the values of $\kappa$ and $r$. We set $m=10000,n=5000$, and $q=100$, and for IS-Krylov-PS, we set $\ell=50$. All computations are terminated once the number of iterations exceeds a certain limit.
  • Figure 5: Performance of RABK, SCGP, and IS-Krylov-PS for linear systems with coefficient matrices from LIBSVM [chang2011libsvm]. Figures depict the evolution of RSE with respect to the number of iterations and the CPU time. Each plot title indicates the dataset name and data dimensions. We set $q=300$ and stop the algorithms if the number of iterations exceeds a certain limit.
  • ...and 1 more figure

Theorems & Definitions (25)

  • Lemma 2.1: [lorenz2023minimal, Lemma 2.3]
  • Lemma 2.2: [zeng2024adaptive, Lemma 2.3]
  • Lemma 2.3: [zeng2024adaptive, Lemma 2.4]
  • Lemma 2.4: [rieger2023generalized, Lemma 13]
  • Proposition 3.1
  • proof
  • Remark 3.2
  • Theorem 3.3
  • proof
  • Remark 3.4
  • ...and 15 more