A subspace constrained randomized Kaczmarz method for structure or external knowledge exploitation

Jackie Lok, Elizaveta Rebrova

TL;DR

The subspace constraint accelerates convergence, especially when the system has approximately low-rank structure. On Gaussian-like random data, it yields a form of dimension reduction that effectively increases the aspect ratio of the system.

Abstract

We study a version of the randomized Kaczmarz algorithm for solving systems of linear equations where the iterates are confined to the solution space of a selected subsystem. We show that the subspace constraint leads to an accelerated convergence rate, especially when the system has approximately low-rank structure. On Gaussian-like random data, we show that it results in a form of dimension reduction that effectively increases the aspect ratio of the system. Furthermore, this method serves as a building block for a second, quantile-based algorithm for solving linear systems with arbitrary sparse corruptions, which is able to efficiently utilize external knowledge about corruption-free equations and achieve convergence in difficult settings. Numerical experiments on synthetic and realistic data support our theoretical results and demonstrate the validity of the proposed methods for even more general data models than guaranteed by the theory.
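
As a rough illustration of the method described in the abstract, the following NumPy sketch confines the iterates to the solution space of a chosen subsystem $\mathbf{A}_{I_0} \mathbf{x} = \mathbf{b}_{I_0}$ by projecting each Kaczmarz direction with $\mathbf{P} = \mathbf{I} - \mathbf{A}_{I_0}^{\dagger} \mathbf{A}_{I_0}$. The function name `scrk`, its parameters, the starting point $\mathbf{A}_{I_0}^{\dagger} \mathbf{b}_{I_0}$, and the sampling probabilities proportional to $\lVert \mathbf{P} \mathbf{a}_j \rVert^2$ are assumptions made for this sketch, not details taken from the text above.

```python
import numpy as np

def scrk(A, b, I0, num_iters=2000, seed=0):
    """Sketch of a subspace constrained randomized Kaczmarz iteration.

    Hypothetical helper: names, defaults, and the sampling rule are
    assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    I0 = np.asarray(I0)
    I1 = np.setdiff1d(np.arange(m), I0)      # rows outside the selected block

    A0_pinv = np.linalg.pinv(A[I0])
    P = np.eye(n) - A0_pinv @ A[I0]          # orthogonal projector onto Null(A_{I0})
    x = A0_pinv @ b[I0]                      # starting point solves the subsystem

    PA1 = A[I1] @ P                          # row j is (P a_j)^T because P is symmetric
    sq = np.sum(PA1**2, axis=1)              # ||P a_j||^2 for each row in I1
    sq = np.where(sq > 1e-12 * sq.max(), sq, 0.0)  # never pick rows with P a_j ~ 0
    probs = sq / sq.sum()                    # assumed sampling distribution

    for _ in range(num_iters):
        j = rng.choice(len(I1), p=probs)
        residual = b[I1[j]] - A[I1[j]] @ x
        # Move along P a_j: satisfies equation j and stays in the subsystem's solution space.
        x = x + (residual / sq[j]) * PA1[j]
    return x
```

The pseudoinverse is computed once up front, so its cost is amortized over all iterations; each iteration then touches a single row.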

Paper Structure

This paper contains 29 sections, 20 theorems, 87 equations, 7 figures, 2 algorithms.

Key Result

Theorem 1.1

Suppose that the rows of $\mathbf{A}$ are partitioned into two blocks $\mathbf{A}_{I_0}$ and $\mathbf{A}_{I_1}$ of sizes $m_0$ and $m - m_0$ respectively. Let $\mathbf{P} = \mathbf{I} - \mathbf{A}_{I_0}^{\dagger} \mathbf{A}_{I_0}$ be the orthogonal projector onto $\operatorname{Null}(\mathbf{A}_{I_0})$ ...
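
A quick numerical check of the projector defined in this theorem may help fix ideas. The sketch below builds $\mathbf{P}$ for a random block (the sizes and the Gaussian block are arbitrary choices for illustration, not taken from the paper) and checks the properties expected of the orthogonal projector onto the null space: it is symmetric, idempotent, and annihilated by the block.

```python
import numpy as np

rng = np.random.default_rng(0)
m0, n = 4, 10                            # illustrative sizes (assumptions)
A_I0 = rng.standard_normal((m0, n))      # stand-in for the block A_{I_0}

# P = I - pinv(A_I0) @ A_I0, as in the theorem statement.
P = np.eye(n) - np.linalg.pinv(A_I0) @ A_I0

is_symmetric = np.allclose(P, P.T)       # P^T = P
is_idempotent = np.allclose(P @ P, P)    # P^2 = P
kills_block = np.allclose(A_I0 @ P, 0.0) # range(P) lies in Null(A_I0)
null_dim = int(round(np.trace(P)))       # trace of a projector = dimension of its range
```

For a block with full row rank, the trace of $\mathbf{P}$ equals $n - m_0$, the dimension of the subspace in which the constrained iterates move.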

Figures (7)

  • Figure 3.1: SCRK update from the current iterate $\mathbf{x}^k$ for reaching the vector $\mathbf{x}^{k+1}$ in the solution space $\mathcal{H}_{\{j\}} = \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{a}_j^{\mathsf{T}} \mathbf{x} = b_j \}$ whilst remaining within $\mathcal{H}_{I_0} = \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{A}_{I_0} \mathbf{x} = \mathbf{b}_{I_0} \}$, compared to the RK update for reaching $\mathbf{x}^{k+1}_{\mathrm{RK}}$ alone.
  • Figure 5.1: Performance of SCRK on a system with highly correlated rows for various sizes $m_0$ of $\mathbf{A}_{I_0}$. (Left) Log relative error at each iteration. (Right) Log relative error against time elapsed, including the initial cost of precomputing $\mathbf{A}_{I_0}^{\dagger}$ for each $m_0$. The time taken to reach a log relative error of less than $-8$ is reported in brackets (N/A indicates that this was not reached in 30 seconds).
  • Figure 5.2: Performance of SCRK on a coherent system with low-rank structure using a "perfect" block (with $m_0 = 20$) and a randomly sampled block (with $m_0 = 100$) as described in the main text. The two-subspace Kaczmarz method [NeedellWard2013] and randomized block Kaczmarz method [NeedellTropp2014] (with two block sizes) are also included. (Left) Log relative error at each iteration. (Right) Log relative error against time elapsed, not including the initial costs of precomputing pseudoinverses for SCRK and block Kaczmarz. The time taken to reach a log relative error of less than $-8$ is reported in brackets (N/A indicates that this was not reached in 30 seconds).
  • Figure 5.3: Convergence paths for the SCRK (with $I_0$ equal to the first $m_0 = 25$ rows) and RK methods on a noisy system. The dashed/dotted lines indicate the predicted error horizons $\gamma_0 + \gamma_1$ from Theorem \ref{thm:noisy_scrk_convergence} and $\gamma = \lVert \mathbf{r} \rVert^2 / \sigma_{\mathrm{min}}(\mathbf{A})^2$ from [Needell2010], respectively.
  • Figure 5.4: Performance of the QuantileSCRK method, given a corruption-free block of size $m_0$, compared to the QuantileRK method [HaNeReSw2022] on Gaussian systems with different aspect ratios and $c$ corrupted measurements. (Left) Log relative error after $k$ iterations for various values of the quantile parameter $q$. (Right) Convergence paths using the best quantile parameters $q_{\mathrm{RK}}$ and $q_{\mathrm{SCRK}}$.
  • ...and 2 more figures
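
The geometry described in the caption of Figure 3.1 can be written out as a short derivation. It assumes that the current iterate already lies in $\mathcal{H}_{I_0}$, that $\mathbf{P} = \mathbf{I} - \mathbf{A}_{I_0}^{\dagger} \mathbf{A}_{I_0}$ is the projector from Theorem 1.1, and that $\mathbf{P} \mathbf{a}_j \neq \mathbf{0}$. The step length $t$ is a symbol introduced here for illustration. Moving along $\mathbf{P} \mathbf{a}_j$ keeps the iterate in $\mathcal{H}_{I_0}$, and $t$ is fixed by requiring the $j$-th equation to hold; the last line is the standard unconstrained RK step for comparison.

```latex
\begin{aligned}
\mathbf{x}^{k+1} &= \mathbf{x}^k + t\,\mathbf{P}\mathbf{a}_j, \\
\mathbf{A}_{I_0}\mathbf{x}^{k+1}
  &= \mathbf{A}_{I_0}\mathbf{x}^k + t\,\mathbf{A}_{I_0}\mathbf{P}\mathbf{a}_j
   = \mathbf{b}_{I_0}
  \quad (\text{since } \mathbf{A}_{I_0}\mathbf{P} = \mathbf{0}), \\
\mathbf{a}_j^{\mathsf{T}}\mathbf{x}^{k+1} = b_j
  &\iff t = \frac{b_j - \mathbf{a}_j^{\mathsf{T}}\mathbf{x}^k}{\mathbf{a}_j^{\mathsf{T}}\mathbf{P}\mathbf{a}_j}
        = \frac{b_j - \mathbf{a}_j^{\mathsf{T}}\mathbf{x}^k}{\lVert \mathbf{P}\mathbf{a}_j \rVert^2}
  \quad (\text{since } \mathbf{P}^{\mathsf{T}} = \mathbf{P} = \mathbf{P}^2), \\
\mathbf{x}^{k+1}_{\mathrm{RK}}
  &= \mathbf{x}^k + \frac{b_j - \mathbf{a}_j^{\mathsf{T}}\mathbf{x}^k}{\lVert \mathbf{a}_j \rVert^2}\,\mathbf{a}_j .
\end{aligned}
```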

Theorems & Definitions (46)

  • Theorem 1.1
  • Corollary 1.2
  • Remark 1.3: Per-iteration complexity
  • Theorem 1.4: Simplified version of Theorem \ref{thm:quantilescrk_convergence}
  • Remark 1.5
  • Lemma 3.1
  • proof
  • Remark 3.2
  • proof: Proof of Theorem \ref{thm:scrk_convergence}
  • Remark 3.3
  • ...and 36 more
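
Theorem 1.4 concerns the quantile-based variant mentioned in the abstract, whose details are not reproduced in this listing. The following is only a sketch of one plausible way to combine a quantile test with the subspace constrained update when a corruption-free block $I_0$ is known. Everything specific here is an assumption for illustration: the function name `quantile_scrk`, the default `q=0.7`, normalizing residuals by $\lVert \mathbf{P} \mathbf{a}_i \rVert$, taking the quantile over all rows outside $I_0$, and sampling proportionally to $\lVert \mathbf{P} \mathbf{a}_j \rVert^2$.

```python
import numpy as np

def quantile_scrk(A, b, I0, q=0.7, num_iters=3000, seed=0):
    """Hypothetical quantile-gated variant of the constrained update (a sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    I0 = np.asarray(I0)                       # assumed corruption-free rows
    I1 = np.setdiff1d(np.arange(m), I0)       # rows that may contain corruptions

    A0_pinv = np.linalg.pinv(A[I0])
    P = np.eye(n) - A0_pinv @ A[I0]           # projector onto Null(A_{I0})
    x = A0_pinv @ b[I0]                       # start inside the trusted solution space

    PA1 = A[I1] @ P                           # rows are (P a_j)^T
    norms = np.linalg.norm(PA1, axis=1)
    ok = norms > 1e-12 * norms.max()          # drop rows with P a_j ~ 0
    PA1, A1, b1, norms = PA1[ok], A[I1][ok], b[I1][ok], norms[ok]
    probs = norms**2 / np.sum(norms**2)       # assumed sampling distribution

    for _ in range(num_iters):
        resid = (b1 - A1 @ x) / norms         # normalized residuals of all candidate rows
        threshold = np.quantile(np.abs(resid), q)
        j = rng.choice(len(b1), p=probs)
        # Skip rows whose residual is suspiciously large (possibly corrupted).
        if abs(resid[j]) <= threshold:
            x = x + (resid[j] / norms[j]) * PA1[j]
    return x
```

Recomputing every residual at each iteration keeps the sketch short but costs a full matrix-vector product per step; a practical implementation might estimate the quantile from a subsample of rows instead.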