Towards understanding CG and GMRES through examples

Erin Carson, Jörg Liesen, Zdeněk Strakoš

TL;DR

This paper reframes CG and GMRES as nonlinear, data-adaptive Krylov methods and demonstrates their behavior through carefully chosen computed examples. By contrasting exact and finite-precision performance, it shows that traditional linear bounds (e.g., condition-number-based convergence) fail to capture key phenomena driven by eigenvalue distribution, nonnormality, and residual-versus-error dynamics. It highlights the sensitive role of preconditioning beyond mere spectrum clustering, and emphasizes robust stopping via normwise backward error rather than residual norms. The work argues for a deeper, nonlocal understanding of Krylov methods, bridging finite-precision insights with operator-theoretic concepts to guide practical algorithm design and open problems in the field.
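For orientation, the normwise backward error mentioned above can be written out. The definition below is the standard one from the numerical linear algebra literature, not a formula quoted from this summary; the symbols $x_k$ (the $k$-th approximate solution of $Ax=b$), $\Delta A$, $\Delta b$, and $\beta$ are notation assumed here.

$$
\beta(x_k) \;=\; \min\bigl\{\varepsilon : (A+\Delta A)\,x_k = b+\Delta b,\ \|\Delta A\|\le \varepsilon\|A\|,\ \|\Delta b\|\le \varepsilon\|b\|\bigr\} \;=\; \frac{\|b-Ax_k\|}{\|A\|\,\|x_k\|+\|b\|}.
$$

Unlike the relative residual norm $\|b-Ax_k\|/\|b\|$, this quantity also accounts for the sizes of $A$ and $x_k$, so it measures how far the data would have to be perturbed for $x_k$ to be an exact solution.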

Abstract

When the CG method for solving linear algebraic systems was formulated about 70 years ago by Lanczos, Hestenes, and Stiefel, it was considered an iterative process possessing a mathematical finite termination property. CG was placed into a rich mathematical context, including links with Gauss quadrature and continued fractions. The optimality property of CG was described via a normalized weighted polynomial least squares approximation to zero. This highly nonlinear problem explains the adaptation of CG iterates to the given data. Karush and Hayes immediately considered CG in infinite dimensional Hilbert spaces and investigated its superlinear convergence. Since then, the view of CG and other Krylov subspace methods has changed. Today these methods are primarily used as computational tools, and their behavior is typically characterized using linear upper bounds or heuristics based on clustering of eigenvalues. Such simplifications limit the mathematical understanding and also negatively affect their practical application. This paper offers a different perspective. Focusing on CG and GMRES, it presents mathematically important and practically relevant phenomena that uncover their behavior through a discussion of computed examples. These examples provide an easily accessible approach that enables understanding of the methods, while pointers to more detailed analyses in the literature are given. This approach allows readers to choose the level of depth and thoroughness appropriate for their intentions. Some of the points made in this paper illustrate well known facts. Others challenge mainstream views and explain existing misunderstandings. Several points refer to recent results leading to open problems. We consider CG and GMRES crucially important for the mathematical understanding, further development, and practical applications also of other Krylov subspace methods.
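The remark about linear upper bounds can be illustrated with a minimal sketch. It runs textbook CG in double precision on two diagonal matrices that share the same condition number but have different eigenvalue distributions, and compares the relative $A$-norm errors with the classical condition-number bound $2\left((\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)\right)^k$. The function name, the right-hand side, the parameter `rho`, and the two spectra are assumptions chosen for illustration; they are not claimed to reproduce the paper's experiments, and the computation is in finite precision rather than exact arithmetic.

```python
import numpy as np

def cg_rel_a_norm_errors(eigs, iters):
    """Textbook CG on diag(eigs) x = b with x0 = 0.

    Returns the relative A-norm errors ||x - x_k||_A / ||x - x_0||_A
    for k = 0, ..., iters. (Hypothetical helper for this sketch.)
    """
    d = np.asarray(eigs, dtype=float)      # diagonal of A
    n = d.size
    b = np.ones(n) / np.sqrt(n)            # assumed right-hand side
    x_true = b / d                         # exact solution of the diagonal system
    x = np.zeros(n)
    r = b.copy()                           # residual for x0 = 0
    p = r.copy()                           # first search direction
    rr = r @ r
    e0 = np.sqrt(x_true @ (d * x_true))    # initial A-norm error
    errs = [1.0]
    for _ in range(iters):
        Ap = d * p
        alpha = rr / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        e = x_true - x
        errs.append(np.sqrt(e @ (d * e)) / e0)
    return np.array(errs)

# Two spectra in [0.1, 1e3] with identical condition number (assumed choices).
n, lam_min, lam_max, iters = 30, 0.1, 1.0e3, 20   # iters < n on purpose
i = np.arange(1, n + 1)
eig_uniform = np.linspace(lam_min, lam_max, n)
rho = 0.6  # assumed parameter: pushes most eigenvalues towards lam_min
eig_clustered = lam_min + (i - 1) / (n - 1) * (lam_max - lam_min) * rho ** (n - i)

err_uniform = cg_rel_a_norm_errors(eig_uniform, iters)
err_clustered = cg_rel_a_norm_errors(eig_clustered, iters)

# Classical bound, which depends on the condition number only.
kappa = lam_max / lam_min
q = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
kappa_bound = 2.0 * q ** np.arange(iters + 1)

for k in range(0, iters + 1, 5):
    print(k, err_uniform[k], err_clustered[k], kappa_bound[k])
```

Both error curves stay below the same bound, since the bound sees only $\kappa$, yet they differ from each other because CG adapts to the distribution of the eigenvalues and to the right-hand side.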

Paper Structure

This paper contains 24 sections, 37 equations, 19 figures, and 2 algorithms.

Figures (19)

  • Figure 1: Left: Three distributions of 30 eigenvalues in $[0.1,\,10^3]$. Right: The relative error in the $A$-norm for exact CG applied to the corresponding linear algebraic systems.
  • Figure 1: Residual norms of GMRES applied to the two linear algebraic systems constructed with prescribed eigenvalues and convergence curves (solid blue for $\lambda_1=\cdots=\lambda_N=1$ and dashed red for $\lambda_j=j$).
  • Figure 2: Cumulative spectral density plots for Figure \ref{fig:cgeigdist}.
  • Figure 2: Left: Eigenvalues of 100 matrices $D_N$ and the boundary of the unit disk. Right: Relative GMRES residual norms for linear algebraic systems with matrices $D_N$, $1.2I+D_N$, and $2I+0.5D_N$, and the upper bounds from \eqref{eqn:GMRES-disk}.
  • Figure 3: The relative error in the $A$-norm for exact CG run on three problems with matrices having different distributions of 10 eigenvalue clusters, where each cluster contains 10 eigenvalues.
  • ...and 14 more figures