
Quantile Randomized Kaczmarz Algorithm with Whitelist Trust Mechanism

Sofiia Shvaiko, Longxiu Huang, Elizaveta Rebrova

TL;DR

The paper addresses the robust solution of overdetermined linear systems $\mathbf{A}\mathbf{x}^\star=\mathbf{b}$ when the observed right-hand side is corrupted as $\tilde{\mathbf{b}}=\mathbf{b}+\boldsymbol{\varepsilon}$ with $\|\boldsymbol{\varepsilon}\|_0\le \beta m$. It reanalyzes QuantileRK (QRK) and introduces WhiteList QuantileRK (WL-QRK), a lightweight online detector with a whitelist/blocklist mechanism that screens rows using two residual thresholds and reintroduces rows once they appear trustworthy. Residuals are subsampled, which reduces the per-iteration cost to $\mathcal{O}(n)+\mathcal{O}(t)$ with $t\ll m$. Theoretical results include a refined QRK convergence rate that improves as $\beta$ decreases and a residual-based identifiability lemma showing that the largest residuals eventually concentrate on corrupted equations. Empirically, WL-QRK outperforms RK and QRK on synthetic and real data, including a tomography system and the Wisconsin Breast Cancer (WBC) system, demonstrating faster convergence and robustness to sparse large-scale corruptions. The work thus offers a practical, scalable approach to robust linear solving in the presence of adversarial row-wise noise.
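The screening idea can be illustrated with a small bookkeeping routine. This is only a sketch of one plausible reading of "two residual thresholds": the function name `update_trust`, the parameters `q_block` and `q_trust`, and the choice to define both thresholds as quantiles of the current absolute residuals are assumptions made for illustration, not the rule used in the paper.

```python
import numpy as np

def update_trust(residuals, blocked, q_block=0.9, q_trust=0.5):
    """Return an updated boolean blocklist mask (hypothetical rule).

    Assumption: both thresholds are quantiles of the current |residuals|;
    the paper's actual thresholds and schedule may differ.
    """
    abs_res = np.abs(np.asarray(residuals, dtype=float))
    hi = np.quantile(abs_res, q_block)  # rows above this are blocked
    lo = np.quantile(abs_res, q_trust)  # rows at or below this are trusted again
    new_blocked = np.array(blocked, dtype=bool)  # copy; the input is not modified
    new_blocked[abs_res > hi] = True
    new_blocked[abs_res <= lo] = False
    return new_blocked
```

Rows whose residual falls between the two thresholds keep their previous status, so a row is neither blocked nor reinstated on weak evidence.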

Abstract

Randomized Kaczmarz (RK) is a simple and fast solver for consistent overdetermined systems, but it is known to be fragile under noise. We study overdetermined $m\times n$ linear systems with a sparse set of corrupted equations, ${\bf A}{\bf x}^\star = {\bf b}$, where only $\tilde{\bf b} = {\bf b} + \boldsymbol{\varepsilon}$ is observed with $\|\boldsymbol{\varepsilon}\|_0 \le \beta m$. The recently introduced QuantileRK (QRK) algorithm addresses this issue by testing residuals against a quantile threshold, but computing a per-iteration quantile across many rows is costly. In this work we (i) reanalyze QRK and show that its convergence rate improves monotonically as the corruption fraction $\beta$ decreases; (ii) propose a simple online detector that flags and removes unreliable rows, which reduces the effective $\beta$ and speeds up convergence; and (iii) make the method practical by estimating quantiles from a small random subsample of rows, preserving robustness while lowering the per-iteration cost. Simulations on imaging and synthetic data demonstrate the efficiency of the proposed method.
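A minimal sketch of the quantile test with a subsampled threshold may help make item (iii) concrete. The function name `subsampled_qrk`, the defaults for `q`, `t`, and `iters`, the row normalization, and sampling the `t` rows with replacement are all assumptions made for illustration; the paper's algorithm may differ in these details.

```python
import numpy as np

def subsampled_qrk(A, b_obs, q=0.7, t=50, iters=5000, seed=0):
    """Quantile RK where the threshold is estimated from t sampled rows."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Normalize rows so an accepted step is a plain projection onto a hyperplane.
    norms = np.linalg.norm(A, axis=1)
    A = A / norms[:, None]
    b_obs = b_obs / norms
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(m)                    # candidate row for the update
        sample = rng.integers(0, m, size=t)    # t rows, drawn with replacement
        res = np.abs(A[sample] @ x - b_obs[sample])
        threshold = np.quantile(res, q)        # subsampled q-quantile of |residuals|
        r_i = A[i] @ x - b_obs[i]
        if abs(r_i) <= threshold:              # skip rows with suspiciously large residual
            x = x - r_i * A[i]
    return x
```

On a Gaussian test system with a small fraction of heavily corrupted entries in `b_obs`, the iterate is expected to approach the true solution because corrupted rows almost never pass the quantile test once the clean residuals have shrunk.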

Paper Structure

This paper contains 4 sections, 2 theorems, 9 equations, 4 figures, and 2 algorithms.

Key Result

Theorem 3.1

Under the incoherence assumption, with $q = 1 - \beta - \alpha$ and $\alpha, \beta \in (0, 1)$, the QRK algorithm applied to the full residual has a (detailed) convergence rate

Figures (4)

  • Figure 1: The convergence rate in Theorem 3.1 improves as $\beta$ decreases (illustrated with $C_D = 1$, $n = 10$, $\alpha = 0.05$).
  • Figure 2: Subsample WL-QRK converges faster than subsample QRK for various corruption types, whereas RK does not converge.
  • Figure 3: Subsample WL-QRK performs similarly to the full sample WL-QRK (Left) and the effective corruption rate decreases as the number of blocking cycles grows (Right).
  • Figure 4: Full sample WL-QRK vs. QRK on the Tomography system (Left) and the WBC system (Right).

Theorems & Definitions (3)

  • Theorem 3.1
  • Lemma 3.2
  • Proof