Over-Relaxation in Alternating Projections

Alireza Entezari, Arunava Banerjee

TL;DR

This paper addresses convergence rate bounds for Gauss-Seidel, Kaczmarz, and other projection methods under randomized order, and proposes a practical over-relaxation strategy. It introduces a covariance-based view of the iterates, where the error covariance $\boldsymbol{\Sigma}_k$ evolves linearly via a superoperator $\boldsymbol{\mathcal A}$, making the asymptotic rate $\rho(\omega)$ equal to the spectral radius $\lambda_{\max}(\boldsymbol{\mathcal A})$. To bound this rate, the authors derive the C-bound, a closed-form bound depending on the first two eigenvalues $\mu_1, \mu_2$ and a fourth-order term $\xi$ of the expected projector, constructing a surrogate $\boldsymbol{\mathcal C}^\star$ that eclipses all admissible surrogates. The resulting bound implies an optimal over-relaxation parameter $\omega^\star \in [1,2)$ that guarantees faster convergence than $\omega=1$, with tighter performance guarantees than the conventional B-bound. The framework connects spectral properties of $\boldsymbol{A}$ to the stochastic behavior of randomized projections, offering practical guidance for accelerating convergence in large-scale linear systems and related imaging and learning tasks.
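
The covariance view can be illustrated with a short derivation. This is a sketch for a relaxed Kaczmarz-type step, not the paper's own statement: the rank-one projectors $\boldsymbol{P}_i$, the mean projector $\bar{\boldsymbol{P}}$, and the independence of the sampled index $i$ from $\boldsymbol{\varepsilon}_k$ are assumptions made here for illustration.

```latex
% Assumed error recursion for one relaxed projection step (illustrative, not quoted from the paper)
\boldsymbol{\varepsilon}_{k+1} = (\boldsymbol{I} - \omega \boldsymbol{P}_i)\,\boldsymbol{\varepsilon}_k,
\qquad
\boldsymbol{P}_i = \frac{\boldsymbol{a}_i \boldsymbol{a}_i^{\top}}{\lVert \boldsymbol{a}_i \rVert^2},
\qquad
\bar{\boldsymbol{P}} = \mathbb{E}_i[\boldsymbol{P}_i]

% Covariance \Sigma_k = E[\varepsilon_k \varepsilon_k^T], with i drawn independently of \varepsilon_k
\boldsymbol{\Sigma}_{k+1}
  = \mathbb{E}_i\!\left[(\boldsymbol{I} - \omega \boldsymbol{P}_i)\,\boldsymbol{\Sigma}_k\,(\boldsymbol{I} - \omega \boldsymbol{P}_i)\right]
  = \boldsymbol{\Sigma}_k
    - \omega\left(\bar{\boldsymbol{P}}\boldsymbol{\Sigma}_k + \boldsymbol{\Sigma}_k\bar{\boldsymbol{P}}\right)
    + \omega^2\,\mathbb{E}_i\!\left[\boldsymbol{P}_i\boldsymbol{\Sigma}_k\boldsymbol{P}_i\right]
  =: \boldsymbol{\mathcal A}(\boldsymbol{\Sigma}_k)

% The expected squared error is the trace of the covariance
\mathbb{E}\,\lVert\boldsymbol{\varepsilon}_k\rVert^2 = \operatorname{tr}\boldsymbol{\Sigma}_k
```

Under these assumptions the map $\boldsymbol{\Sigma}_k \mapsto \boldsymbol{\Sigma}_{k+1}$ is linear in $\boldsymbol{\Sigma}_k$, which is why its spectral radius governs the asymptotic rate.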

Abstract

We improve upon the current bound on convergence rates of Gauss-Seidel, Kaczmarz, and, more generally, projection methods in which projections are visited in randomized order. The tighter bound reveals a practical approach to speeding up convergence by over-relaxation, a longstanding challenge that has been difficult to overcome for general problems with deterministic Successive Over-Relaxation.

Paper Structure

This paper contains 18 sections, 13 theorems, 53 equations, 6 figures.

Key Result

Theorem 1

Iterations in (eq:iter) converge according to where $\lVert\boldsymbol{\varepsilon}_k\rVert = \lVert\mathbf{x}_k - \mathbf{x}_\star\rVert_A$ measures the error in the $\boldsymbol{A}$-induced norm for Gauss-Seidel and the standard Euclidean norm $\lVert\boldsymbol{\varepsilon}_k\rVert = \lVert\mathbf{x}_k - \mathbf{x}_\star\rVert$ for Kaczmarz.

Figures (6)

  • Figure 1: Alternating Projections: At each step, one of the $m=2$ projections is chosen at random. The left figure shows the trajectory for $\omega = 1$, where the error is greedily minimized at each step. The right figure shows the over-relaxation trajectory for $\omega = 1.5$, which exhibits faster convergence. The method of Successive Over-Relaxations (SOR) was devised by David Young in 1950 to improve the convergence rate of Gauss-Seidel for diagonally dominant matrices appearing in elliptic PDE problems. This paper shows that by randomizing the order, instead of cycling in a pre-determined order, the over-relaxation phenomenon occurs in general problems and can be leveraged to speed up convergence.
  • Figure 2: Left: Empirical study of a toy problem with $m=n=5$: instead of forward/backward sweeps, at each step of (eq:iter), $i$ was chosen from $\{1,\dots, 5\}$ uniformly at random. Blue curves show the errors (in log scale) of 150 trials over 1000 iterations without relaxation ($\omega=1$). The solid blue line shows the mean of the trials, whose slope gives the empirical rate of convergence. The dashed black line, which we call the B-bound, shows the current bound on the solid blue line; it is obtained from a Markovian analysis of the error norm and is well known to be loose in practice. The dashed blue line shows the new bound, which we call the C-bound; it traces the mean (solid blue) closely. Green plots show the improved performance due to the over-relaxation resulting from our analysis. Right: $\rho(\omega)$ shows the true convergence rate, determined from the spectral radius of a certain superoperator (playing the role of the iteration matrix), as a function of the relaxation parameter. The B-bound, $B(\omega)$, is based on the smallest singular value of ${\boldsymbol{A}}$, and the C-bound, $C(\omega)$, is based on the two smallest singular values; both provide bounds for $\rho(\omega)$. Our analysis shows that the improvement in the C-bound, and the resulting over-relaxation, depends on the gap between the two smallest singular values and a geometric measure of orthogonality of ${\boldsymbol{A}}$.
  • Figure 3: Upper bounds on the spectral radius of $\boldsymbol{\mathcal{A}}$ (left) correspond to lower bounds on the spectral gap of $\boldsymbol{\mathcal{B}} - \omega \boldsymbol{\mathcal{C}}$ (right).
  • Figure 4: Geometric view of the smallest eigenvalue $\lambda_1(\boldsymbol{\mathcal{B}} - \omega \boldsymbol{\mathcal{C}})$ and the associated eigenvector $\boldsymbol{V}_1(\boldsymbol{\mathcal{B}} - \omega \boldsymbol{\mathcal{C}})$ along which the distance between $\boldsymbol{\mathcal{B}}$ and $\omega\boldsymbol{\mathcal{C}}$ is minimized. The eigenvector is shown for $\omega = 1/2$ (left), $\omega=1$ (middle) and $\omega=2$ (right).
  • Figure 5: The C-bound's effectiveness, and the resulting over-relaxation, is more significant for ill-conditioned problems. Random matrices: $50 \times 50$ on the left with condition number $\kappa({\boldsymbol{A}}) = 61705.3$ and $75 \times 50$ on the right with condition number $\kappa({\boldsymbol{A}}) = 110.2$.
  • ...and 1 more figure
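
The experiments described in the captions above can be approximated with a minimal sketch. It assumes a relaxed Kaczmarz step with rows sampled uniformly at random; the function names, the Kronecker-product representation of the superoperator, and the use of the classical Markovian bound $1 - \omega(2-\omega)\mu_1$ as a stand-in for $B(\omega)$ are all choices made here for illustration, not the authors' code.

```python
import numpy as np


def relaxed_randomized_kaczmarz(A, b, omega=1.0, iters=1000, seed=0):
    """Relaxed Kaczmarz iteration with rows sampled uniformly at random."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norms_sq = np.sum(A * A, axis=1)
    for _ in range(iters):
        i = rng.integers(m)                      # uniform random row index
        residual = b[i] - A[i] @ x               # signed distance to hyperplane i (unnormalized)
        x = x + omega * (residual / row_norms_sq[i]) * A[i]
    return x


def rate_from_superoperator(A, omega):
    """Spectral radius of E_i[(I - omega P_i) kron (I - omega P_i)].

    This n^2 x n^2 matrix represents the assumed covariance update
    Sigma -> E_i[(I - omega P_i) Sigma (I - omega P_i)] acting on vec(Sigma).
    """
    m, n = A.shape
    identity = np.eye(n)
    S = np.zeros((n * n, n * n))
    for a in A:
        P = np.outer(a, a) / (a @ a)             # rank-one orthogonal projector onto span(a)
        M = identity - omega * P
        S += np.kron(M, M) / m
    # S is symmetric because every M is symmetric, so eigvalsh applies.
    return np.max(np.abs(np.linalg.eigvalsh(S)))


def markov_bound(A, omega):
    """Classical per-step bound on expected squared error: 1 - omega(2 - omega) mu_1."""
    m, n = A.shape
    P_bar = sum(np.outer(a, a) / (a @ a) for a in A) / m
    mu_1 = np.linalg.eigvalsh(P_bar)[0]          # smallest eigenvalue of the mean projector
    return 1.0 - omega * (2.0 - omega) * mu_1


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 5))              # toy problem with m = n = 5
    x_star = rng.standard_normal(5)
    b = A @ x_star
    for omega in (1.0, 1.5):
        x = relaxed_randomized_kaczmarz(A, b, omega=omega, iters=1000)
        print(omega, np.linalg.norm(x - x_star),
              rate_from_superoperator(A, omega), markov_bound(A, omega))
```

Sweeping `omega` over a grid in $(0, 2)$ and plotting both functions gives a rough analogue of the right panel of Figure 2. The Markovian stand-in is always minimized at $\omega = 1$, so any over-relaxation benefit can only show up in the superoperator rate.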

Theorems & Definitions (25)

  • Theorem 1: The C-bound
  • Lemma 1: Loewner ordering of superoperators: $\boldsymbol{\mathcal{B}} \succcurlyeq 2\boldsymbol{\mathcal{C}}$
  • proof
  • Lemma 2: Every over-relaxation is better than the corresponding underrelaxation
  • proof
  • Lemma 3: The B-bound
  • proof
  • Lemma 4: Concavity and Convexity
  • proof
  • Lemma 5: Derivatives with respect to $\omega$
  • ...and 15 more
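
The Loewner ordering in Lemma 1 can be checked numerically on small problems. The sketch below assumes one natural reading of the superoperators for a Kaczmarz-type step, namely $\boldsymbol{\mathcal{B}}(\boldsymbol{\Sigma}) = \bar{\boldsymbol{P}}\boldsymbol{\Sigma} + \boldsymbol{\Sigma}\bar{\boldsymbol{P}}$ and $\boldsymbol{\mathcal{C}}(\boldsymbol{\Sigma}) = \mathbb{E}_i[\boldsymbol{P}_i\boldsymbol{\Sigma}\boldsymbol{P}_i]$ with uniform sampling; these definitions and the helper names are assumptions for illustration and may differ from the paper's.

```python
import numpy as np


def projector_superoperators(A):
    """Matrix representations of the assumed superoperators B and C on vec(Sigma)."""
    m, n = A.shape
    identity = np.eye(n)
    B = np.zeros((n * n, n * n))
    C = np.zeros((n * n, n * n))
    for a in A:
        P = np.outer(a, a) / (a @ a)             # rank-one projector for this row
        B += (np.kron(identity, P) + np.kron(P, identity)) / m
        C += np.kron(P, P) / m
    return B, C


def smallest_eigenvalue(A, omega):
    """Smallest eigenvalue of B - omega * C (symmetric under the assumed definitions)."""
    B, C = projector_superoperators(A)
    return np.linalg.eigvalsh(B - omega * C)[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 4))
    for omega in (0.5, 1.0, 2.0):
        print(omega, smallest_eigenvalue(A, omega))
```

Under these definitions, $\boldsymbol{\mathcal{B}} - 2\boldsymbol{\mathcal{C}}$ is an average of terms $(\boldsymbol{I}-\boldsymbol{P}_i)\otimes\boldsymbol{P}_i + \boldsymbol{P}_i\otimes(\boldsymbol{I}-\boldsymbol{P}_i)$, each positive semidefinite, so the smallest eigenvalue at $\omega = 2$ should be nonnegative up to rounding.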