
On Quantile Randomized Kaczmarz for Linear Systems with Time-Varying Noise and Corruption

Nestor Coria, Jamie Haddock, Jaime Pacheco

TL;DR

This work studies solving large-scale linear systems $A{\bm{x}}={\bm{b}}$ when the measurements are affected by time-varying noise ${\bm{n}}^{(k)}$ and corruption ${\bm{c}}^{(k)}$, so that the vector observed in iteration $k$ is ${\bm{b}}^{(k)}={\bm{b}}+{\bm{n}}^{(k)}+{\bm{c}}^{(k)}$. It extends the quantile randomized Kaczmarz (QRK) method, which updates the iterate only when the selected residual entry lies below a quantile-based threshold, to time-varying perturbations via two practical implementations (QRK1 and QRK2). The authors prove that QRK converges linearly in expectation up to a convergence horizon, with the bound characterized by a rate factor $(1-p\varphi)$ and a noise-dependent horizon term $p\gamma_k$. They also use Markov's inequality to prove a lower bound on the probability that the largest residual entries reveal the corrupted indices. Corollaries cover bounded noise, general noise with known distribution, and Gaussian noise, and numerical experiments illustrate both the robustness to time-varying corruption and the detection capability. These results support QRK as a robust, scalable solver in streaming and distributed settings and show that residuals can be used to detect corruption.
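To make the update rule concrete, the following Python sketch shows one generic quantile randomized Kaczmarz loop on time-varying measurements. It is not the paper's QRK1 or QRK2, whose details are not given in this summary. The function name `qrk_sketch`, the callback `b_of_k`, the uniform row sampling, the use of the full residual to compute the quantile, the unit-norm rows, and every size and parameter in the demo are assumptions made for illustration.

```python
import numpy as np


def qrk_sketch(A, b_of_k, q, num_iters, rng):
    """Generic quantile randomized Kaczmarz loop (illustrative only).

    Assumes the rows of A have unit Euclidean norm. b_of_k(k) returns the
    measurement vector b + n^(k) + c^(k) observed at iteration k.
    """
    m, n = A.shape
    x = np.zeros(n)
    for k in range(num_iters):
        b_k = b_of_k(k)
        residual = np.abs(A @ x - b_k)
        # Quantile-based threshold computed from the full residual (assumption).
        threshold = np.quantile(residual, q)
        i = rng.integers(m)  # uniform row sampling (assumption)
        if residual[i] <= threshold:
            # Project onto the hyperplane {z : <a_i, z> = b_k[i]}.
            x = x - (A[i] @ x - b_k[i]) * A[i]
    return x


# Small synthetic demo; all sizes and parameters are made up.
rng = np.random.default_rng(0)
m, n, beta = 1000, 10, 0.01
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)  # normalize rows
x_true = rng.standard_normal(n)
b = A @ x_true


def b_of_k(k):
    # Fresh corruption support and fresh Gaussian noise in every iteration.
    c = np.zeros(m)
    c[rng.choice(m, size=int(beta * m), replace=False)] = 10.0
    noise = 1e-3 * rng.standard_normal(m)
    return b + noise + c


x_hat = qrk_sketch(A, b_of_k, q=0.7, num_iters=2000, rng=rng)
print("initial error:", np.linalg.norm(x_true))
print("final error:  ", np.linalg.norm(x_hat - x_true))
```

Passing the measurements as a callback keeps the solver independent of how the perturbations evolve, so the same loop can be run on static or time-varying systems.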

Abstract

Large-scale systems of linear equations arise in machine learning, medical imaging, sensor networks, and many other areas of data science. When the scale of the system is extreme, it is common for a fraction of the data or measurements to be corrupted. The Quantile Randomized Kaczmarz (QRK) method is known to converge on large-scale systems of linear equations $A\mathbf{x}=\mathbf{b}$ that are inconsistent due to static corruptions in the measurement vector $\mathbf{b}$. We prove that QRK converges even for systems corrupted by time-varying perturbations. Additionally, we prove that QRK converges up to a convergence horizon on systems affected by time-varying noise and corruption. Finally, we use Markov's inequality to prove a lower bound on the probability that the largest entries of the QRK residual reveal the time-varying corruption in each iteration. We present numerical experiments that illustrate our theoretical results.
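The detection idea in the last part of the abstract can be sketched in a few lines of Python: given an iterate close to the solution, the indices of the largest residual entries are reported as suspected corruption. The function name `flag_largest_residuals`, the choice to flag a fixed number of entries, and the synthetic data are assumptions for illustration; the paper's probability bound is not reproduced here.

```python
import numpy as np


def flag_largest_residuals(A, x, b_k, num_flagged):
    """Return the indices of the num_flagged largest entries of |A x - b_k|."""
    residual = np.abs(A @ x - b_k)
    # argpartition places the num_flagged largest entries in the last slots.
    return np.argpartition(residual, -num_flagged)[-num_flagged:]


# Synthetic check with made-up sizes: 5 corrupted entries of size 10.
rng = np.random.default_rng(1)
m, n = 500, 10
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)
x_true = rng.standard_normal(n)
support = rng.choice(m, size=5, replace=False)
c = np.zeros(m)
c[support] = 10.0
b_k = A @ x_true + 1e-3 * rng.standard_normal(m) + c

x_near = x_true + 1e-3 * rng.standard_normal(n)  # iterate near the solution
flagged = flag_largest_residuals(A, x_near, b_k, num_flagged=5)
print(sorted(flagged.tolist()), sorted(support.tolist()))
```

In this toy setting the corruption is far larger than both the noise and the iterate error, which is why the largest residuals line up with the corrupted indices.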

Paper Structure

This paper contains 18 sections, 7 theorems, 90 equations, 7 figures, 2 algorithms.

Key Result

Theorem 1.4

Let ${\bm{x}}^{(k)}$ denote the iterates of Algorithm QuantileRK1 or Algorithm QuantileRK2 applied with quantile $q$ to the system defined by $A$ and ${\bm{b}}^{(k)} = {\bm{b}} + {\bm{n}}^{(k)} + {\bm{c}}^{(k)}$, in the $k$th iteration. Assuming the setup described in Section subsec:problem, Assumpt… where $p$ represents the probability that a row is selected from within the $q$-quantile in each it…

Figures (7)

  • Figure 1: A linear system in which some equations (blue) have been corrupted and two iterations of a Kaczmarz method.
  • Figure 2: An illustration of the errors of the iterates ${\bm{x}}^{(k+1)}$ and $\hat{{\bm{x}}}^{(k+1)}$ produced from ${\bm{x}}^{(k)}$ on the noisy and noiseless systems, respectively.
  • Figure 3: Empirical behavior of QRK ($q = 0.6$) and upper bound provided in Theorem \ref{thm:steinerbergerQRKwNoise} on (left) static noise-free (${\bm{b}}^{(k)} = {\bm{b}} + {\bm{c}}$) and time-varying noise-free (${\bm{b}}^{(k)} = {\bm{b}} + {\bm{c}}^{(k)}$) systems and (right) static noisy (${\bm{b}}^{(k)} = {\bm{b}} + {\bm{c}} + {\bm{n}}$) and time-varying noisy (${\bm{b}}^{(k)} = {\bm{b}} + {\bm{c}}^{(k)} + {\bm{n}}^{(k)}$) systems defined by $A \in \mathbb{R}^{20000 \times 100}$ where corruption vectors are generated with $\beta m$ uniformly randomly selected entries equal to one and the remainder equal to 0 where $\beta = 0.001$. The noise vectors have entries sampled i.i.d. from $\mathcal{N}(0,0.001)$.
  • Figure 4: Empirical behavior of QRK and upper bound provided in Theorem \ref{thm:steinerbergerQRKwNoise} on system defined by $A \in \mathbb{R}^{20000 \times 100}$ and ${\bm{b}}^{(k)} = {\bm{b}} + {\bm{n}}^{(k)} + {\bm{c}}^{(k)}$ where corruption ${\bm{c}}^{(k)}$ is a vector with $\beta m$ uniformly randomly selected entries equal to the corruption size (10) and the remainder equal to 0. The noise ${\bm{n}}^{(k)}$ is an i.i.d. random vector sampled in each iteration with entries from $\mathcal{N}(0,s^2)$. The upper left shows behavior as $q$ varies between 0.5, 0.8 and 0.9, while $\beta = 0.00005$ and $s = 0.01$. The upper right shows behavior as $\beta$ varies between 0.00005, 0.0001 and 0.001, while $q = 0.8$ and $s = 0.01$. The lower center shows behavior as $s$ varies between 0.0001, 0.01 and 0.1, while $q = 0.8$ and $\beta = 0.00005$. Experiments are averaged over 10 trials.
  • Figure 5: Empirical behavior of QRK with $q = 0.8$ on system defined by $A \in \mathbb{R}^{20000 \times 100}$ and ${\bm{b}}^{(k)} = {\bm{b}} + {\bm{n}}^{(k)} + {\bm{c}}^{(k)}$ where corruption ${\bm{c}}^{(k)}$ is a vector with $\beta m$ uniformly randomly selected entries equal to the corruption size (10) and the remainder equal to 0. The noise ${\bm{n}}^{(k)}$ is an i.i.d. random vector sampled in each iteration with entries from $\mathcal{N}(0,10^{-8})$. Plots exhibit error vs iteration on systems with fraction of corruption $\beta \in \{0.1, 0.15, 0.2, 0.25\}$.
  • ...and 2 more figures

Theorems & Definitions (18)

  • Definition 1.1
  • Theorem 1.4
  • Remark
  • Corollary 1.4.1
  • Corollary 1.4.2
  • Remark
  • Theorem 2.1
  • proof
  • Lemma 2.2
  • proof
  • ...and 8 more