Conjugate Direction Methods Under Inconsistent Systems

Alexander Lim, Yang Liu, Fred Roosta

TL;DR

The paper studies CG and CR for symmetric inconsistent systems, formalizing difficulties in recovering normal solutions and introducing CG_pis to obtain the pseudo-inverse solution x^+ = A^+ b. It shows that CR can converge to a normal solution under no unlucky breakdown and is essentially equivalent to MINRES in this setting, with a simple CR_pis variant to reach the pseudo-inverse solution. A unifying framework clarifies that many properties of CD methods extend beyond the PD case, and extensive numerical experiments (synthetic and real-world) reveal that CG can be catastrophically unstable under inconsistency while CR and MINRES remain comparatively stable and effective. The findings have practical impact for choosing solvers in applications such as PDEs, image processing, and kernel methods where inconsistency or near-singularity arises.
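
To make the terms concrete, the following is a minimal NumPy sketch of a normal solution versus the pseudo-inverse solution. It assumes the standard definitions: a normal solution minimizes $\|\mathbf{A}\mathbf{x} - \mathbf{b}\|$, which for symmetric $\mathbf{A}$ means $\mathbf{A}(\mathbf{A}\mathbf{x} - \mathbf{b}) = \mathbf{0}$, and $\mathbf{x}^+ = \mathbf{A}^+ \mathbf{b}$ is the normal solution of smallest norm. The 3x3 matrix and the helper name `is_normal_solution` are illustrative choices, not taken from the paper.

```python
import numpy as np

# Symmetric, singular matrix; b has a component in Null(A), so Ax = b is inconsistent.
A = np.diag([1.0, 2.0, 0.0])
b = np.array([1.0, 1.0, 1.0])

# Pseudo-inverse solution x^+ = A^+ b.
x_plus = np.linalg.pinv(A) @ b      # [1.0, 0.5, 0.0]


def is_normal_solution(A, x, b, tol=1e-10):
    """Check the normal equation A(Ax - b) = 0 for symmetric A (hypothetical helper)."""
    return np.linalg.norm(A @ (A @ x - b)) <= tol


# Adding any null-space component gives another normal solution with a larger norm.
x_other = x_plus + np.array([0.0, 0.0, 5.0])

print(is_normal_solution(A, x_plus, b), is_normal_solution(A, x_other, b))
print(np.linalg.norm(x_plus), np.linalg.norm(x_other))
```

Both vectors attain the same least-squares residual; only `x_plus` has no component in the null space of `A`.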

Abstract

Since the development of the conjugate gradient (CG) method in 1952 by Hestenes and Stiefel, CG has become an indispensable tool in computational mathematics for solving positive definite linear systems. On the other hand, the conjugate residual (CR) method, closely related to CG and introduced by Stiefel in 1955 for the same settings, remains relatively less known outside the numerical linear algebra community. Since their inception, these methods -- henceforth collectively referred to as conjugate direction methods -- have been extended beyond positive definite to indefinite, albeit consistent, settings. Going one step further, in this paper, we investigate the theoretical and empirical properties of these methods under inconsistent systems. Among other things, we show that small modifications to the original algorithms allow for the pseudo-inverse solution. Furthermore, we show that CR is essentially equivalent to the minimum residual method, proposed by Paige and Saunders in 1975, in such contexts. Lastly, we conduct a series of numerical experiments to shed light on their numerical stability (or lack thereof) and their performance for inconsistent systems. Surprisingly, we will demonstrate that, unlike CR and contrary to popular belief, CG can exhibit significant numerical instability, bordering on catastrophe in some instances.
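
As a rough illustration of CR reaching a normal solution on an inconsistent system, here is a sketch of the textbook CR recurrence on a small positive semidefinite example. It is not the paper's CR or CR_pis algorithm. The function name `conjugate_residual`, the stopping test on $\|\mathbf{A}\mathbf{r}_k\|$, and the tolerance are assumptions made for this example.

```python
import numpy as np


def conjugate_residual(A, b, tol=1e-10, max_iter=None):
    """Textbook CR from x0 = 0 for symmetric A; stops once ||A r_k|| <= tol."""
    x = np.zeros_like(b, dtype=float)
    r = b.astype(float).copy()
    p = r.copy()
    Ar = A @ r
    Ap = Ar.copy()
    rAr = r @ Ar
    for _ in range(max_iter or 2 * len(b)):
        if np.linalg.norm(Ar) <= tol:      # normal equation A r = 0 is satisfied
            break
        alpha = rAr / (Ap @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        Ar = A @ r
        rAr_new = r @ Ar
        beta = rAr_new / rAr
        p = r + beta * p
        Ap = Ar + beta * Ap                # keeps Ap = A @ p without an extra product
        rAr = rAr_new
    return x, r


A = np.diag([1.0, 2.0, 0.0])               # symmetric PSD, singular
b = np.ones(3)                             # not in Range(A): inconsistent system
x_cr, r_cr = conjugate_residual(A, b)
print(x_cr, np.linalg.norm(A @ r_cr))
print(np.linalg.pinv(A) @ b)
```

On this example the iterate returned by the sketch satisfies the normal equation but still carries a null-space component, so it differs from $\mathbf{A}^+ \mathbf{b}$. That gap is what a pseudo-inverse variant has to close.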

Paper Structure (20 sections, 23 theorems, 84 equations, 16 figures, 3 tables, 5 algorithms)

Key Result

Theorem 1

Let $\mathbf{b} \notin \text{Range}(\mathbf{A})$ and $\mathbf{b} \notin \text{Null}(\mathbf{A})$. Under the no-unlucky-breakdown assumption (assumpt:unlucky), the iterates of the CG algorithm (alg:cg) satisfy $\mathbf{A}\mathbf{r}_k \neq \mathbf{0}$ for all $0 \leq k \leq g-1$.
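
The statement can be checked numerically on a small case. The sketch below runs the textbook Hestenes-Stiefel CG recurrence, which may differ in detail from the paper's alg:cg, and records $\|\mathbf{A}\mathbf{r}_k\|$ at every iterate until a direction with $\mathbf{p}_k^\top \mathbf{A}\mathbf{p}_k \approx 0$ appears. The function name `cg_track_Ar` and the threshold `curv_tol` are hypothetical. The example assumes that the grade $g$ is the dimension of the Krylov subspace generated by $\mathbf{A}$ and $\mathbf{b}$, which is 3 here.

```python
import numpy as np


def cg_track_Ar(A, b, curv_tol=1e-12):
    """Textbook CG from x0 = 0; returns ||A r_k|| per iterate and the stopping index."""
    x = np.zeros_like(b, dtype=float)
    r = b.astype(float).copy()
    p = r.copy()
    Ar_norms = []
    for k in range(len(b)):
        Ar_norms.append(np.linalg.norm(A @ r))
        Ap = A @ p
        curvature = p @ Ap
        if abs(curvature) <= curv_tol * (p @ p):   # (near) zero curvature: CG cannot step
            return Ar_norms, k
        alpha = (r @ r) / curvature
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return Ar_norms, len(b)


A = np.diag([1.0, 2.0, 0.0])     # symmetric PSD, singular
b = np.ones(3)                   # b is neither in Range(A) nor in Null(A)
Ar_norms, stop = cg_track_Ar(A, b)
print(stop, Ar_norms)
```

In this example the recurrence stops at $k = 2 = g - 1$ on a zero-curvature direction, and $\|\mathbf{A}\mathbf{r}_k\|$ stays well away from zero for $k = 0, 1, 2$, so CG never reaches a residual that satisfies the normal equation.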

Figures (16)

  • Figure 1: Our proof strategy for establishing equivalence between MINRES and CR. To prove \ref{thm:mr=cr}, we use \ref{lem:consistent_mr=cr}, \ref{lem:cr_tilde}, and \ref{lem:inconsisent_mr_same}.
  • Figure 2: Experiments for \ref{sec:convergence_cg} to verify \ref{lem:cg:Ap} and \ref{thm:cg:Ar}. The matrices are PSD with $d = 10$. From the top left to the bottom right, the grades are 6, 7, 8, 9, 10 and 10. The bottom right is a positive definite matrix.
  • Figure 3: Experiments for \ref{sec:convergence_cg} to verify \ref{lem:cg:Ap} and \ref{thm:cg:Ar}. The matrices are indefinite with $d = 10$. From the top left to the bottom right, the grades are 6, 7, 8, 9, 10 and 10. The bottom right is a full-rank indefinite matrix.
  • Figure 4: Experiments for \ref{sec:instability_cg} to explore instabilities within CG. The matrices are PSD with $d = 100$. From top left to bottom right, the grades are 11, 21, 41, 81, 100 and 100. The bottom right is a positive definite matrix for which CG terminates at iteration 59 with $\|\mathbf{r}_k\|/\|\mathbf{b}\| < 10^{-8}$.
  • Figure 5: Experiments for \ref{sec:instability_cg} to explore instabilities within CG. The matrices are indefinite with $d = 100$. From top left to bottom right, the grades are 11, 21, 41, 81, 100 and 100. The bottom right is a full-rank indefinite matrix for which CG reaches the maximum allowed number of iterations.
  • ...and 11 more figures

Theorems & Definitions (56)

  • Definition 1: Normal Solutions and the Pseudo-inverse Solution
  • Definition 2: Grade of $\mathbf{v}$ with respect to $\mathbf{A}$
  • Remark 1: Initialization
  • Remark 2: Symmetric Matrix and Non-zero Right-hand side
  • Definition 3: Zero-curvature Direction
  • Remark 3
  • Theorem 1: CG's Inability to Recover Normal Solutions
  • proof
  • Remark 4
  • Lemma 1
  • ...and 46 more