Asymptotic convergence of restarted Anderson acceleration for certain normal linear systems

Oliver A. Krzysik; Hans De Sterck; Adam Smith

TL;DR

This work analyzes restarted Anderson acceleration with restart size one (rAA(1)) for affine fixed-point iterations $\mathbf{x}_{k+1}=\mathbf{q}(\mathbf{x}_k)=M\mathbf{x}_k+\mathbf{b}$, focusing on the cases where $M$ is symmetric or skew-symmetric. For symmetric $M$, the authors derive an eigenvector-dependent nonlinear eigenvalue problem (NEPv) that governs the four-step residual map, show how the asymptotic convergence factor can depend strongly on the initial iterate, and provide explicit two-by-two closed forms that capture this dependence. For skew-symmetric $M$, they show that the rAA(1) residual iteration behaves like a power iteration on the dominant skew-symmetric subspace, leading to a worst-case convergence factor $\varrho^{\rm worst}_{\rm AA} = \rho(M)/(1+\rho(M)^2)^{1/4}$ that is independent of the initial residual, with a sharp convergence threshold $\rho(M) < \sqrt{(1+\sqrt{5})/2}$. Numerical experiments corroborate the theory, illustrate nonlinear extensions, and indicate applicability to more general nonlinear $\mathbf{q}$ with symmetric or skew-symmetric Jacobians at the fixed point. These results provide concrete, exact convergence factors in special but meaningful settings and offer insight into how AA accelerates fixed-point iterations beyond existing bounds, with implications for windowed variants and practical nonlinear problems.
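The convergence threshold quoted above for skew-symmetric $M$ follows directly from the stated worst-case factor. As a short worked check, using only the formula $\varrho^{\rm worst}_{\rm AA} = \rho(M)/(1+\rho(M)^2)^{1/4}$ from the summary (the auxiliary variable $t$ is introduced here purely for illustration):

```latex
\varrho^{\rm worst}_{\rm AA} < 1
\iff \rho(M)^4 < 1 + \rho(M)^2
\iff t^2 - t - 1 < 0, \qquad t := \rho(M)^2 \geq 0,
\iff t < \frac{1+\sqrt{5}}{2}
\iff \rho(M) < \sqrt{\frac{1+\sqrt{5}}{2}} \approx 1.272 .
```

Since the threshold exceeds one, the formula suggests that rAA(1) can converge in the worst case for some skew-symmetric $M$ with $\rho(M) > 1$. Moreover, $(1+\rho(M)^2)^{1/4} > 1$ whenever $\rho(M) > 0$, so the factor is strictly smaller than $\rho(M)$ itself.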

Abstract

Anderson acceleration (AA) is widely used for accelerating the convergence of an underlying fixed-point iteration $\bm{x}_{k+1} = \bm{q}( \bm{x}_{k} )$, $k = 0, 1, \ldots$, with $\bm{x}_k \in \mathbb{R}^n$, $\bm{q} \colon \mathbb{R}^n \to \mathbb{R}^n$. Despite AA's widespread use, relatively little is understood theoretically about the extent to which it may accelerate the underlying fixed-point iteration. To this end, we analyze a restarted variant of AA with a restart size of one, a method closely related to GMRES(1). We consider the case of $\bm{q}( \bm{x} ) = M \bm{x} + \bm{b}$ with matrix $M \in \mathbb{R}^{n \times n}$ either symmetric or skew-symmetric. For both classes of $M$ we compute the worst-case root-average asymptotic convergence factor of the AA method, relying in part on a conjecture in the symmetric setting, and prove that it is strictly smaller than that of the underlying fixed-point iteration. For symmetric $M$, we show that the AA residual iteration corresponds to a fixed-point iteration for solving an eigenvector-dependent nonlinear eigenvalue problem (NEPv), and we show how this can result in the convergence factor depending strongly on the initial iterate, which we quantify exactly in certain special cases. Conversely, for skew-symmetric $M$ we show that the AA residual iteration is closely related to a power iteration for $M$, and that this results in the convergence factor being independent of the initial iterate. Supporting numerical results are given, which also indicate that the theory is applicable to the more general setting of nonlinear $\bm{q}$ whose Jacobian at the fixed point is symmetric or skew-symmetric.
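To make the method concrete, the following is a minimal sketch of rAA(1) for an affine $\bm{q}$. It assumes that each cycle consists of one plain fixed-point step followed by one AA(1) step of the standard form $\bm{x}_{k+1} = \bm{q}(\bm{x}_k) + \beta_k (\bm{q}(\bm{x}_k) - \bm{q}(\bm{x}_{k-1}))$, with $\beta_k$ minimizing $\| \bm{r}_k + \beta (\bm{r}_k - \bm{r}_{k-1}) \|$ and residual $\bm{r}_k = \bm{x}_k - \bm{q}(\bm{x}_k)$. The function name `raa1`, the sign convention for the residual, and the test problem are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def raa1(q, x0, num_pairs):
    """Sketch of rAA(1): each cycle is one fixed-point step plus one AA(1) step.

    Returns the residual norms ||x - q(x)|| at the start and after each cycle.
    The update form and residual sign are assumptions (standard AA(1)).
    """
    x = np.asarray(x0, dtype=float)
    norms = [np.linalg.norm(x - q(x))]
    for _ in range(num_pairs):
        qx = q(x)
        r_old = x - qx                 # residual at the restart point
        x_fp = qx                      # plain fixed-point step
        qx_fp = q(x_fp)
        r_new = x_fp - qx_fp           # residual after the fixed-point step
        d = r_new - r_old
        dd = d @ d
        # One-dimensional least squares: minimize ||r_new + beta * d||.
        beta = -(r_new @ d) / dd if dd > 0.0 else 0.0
        x = qx_fp + beta * (qx_fp - qx)  # AA(1) step
        norms.append(np.linalg.norm(x - q(x)))
    return np.array(norms)


# Hypothetical 2x2 skew-symmetric test problem with rho(M) = a.
a = 0.9
M = np.array([[0.0, a], [-a, 0.0]])
b = np.array([1.0, 2.0])
norms = raa1(lambda x: M @ x + b, np.array([3.0, -1.0]), 8)

pair_factor = norms[1:] / norms[:-1]      # residual reduction per two-step cycle
per_step = np.sqrt(pair_factor[-1])       # root-average factor per iteration
predicted = a / (1.0 + a**2) ** 0.25      # rho / (1 + rho^2)^(1/4)
print(per_step, predicted)
```

For this particular $2 \times 2$ example every cycle reduces the residual norm by the same amount, so the measured per-iteration factor can be compared directly with $\rho(M)/(1+\rho(M)^2)^{1/4}$. For larger matrices the agreement would be expected only asymptotically and in the worst case.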

Paper Structure (26 sections, 14 theorems, 83 equations, 9 figures)

Introduction
Preliminaries
Simplifying assumptions
Residual propagation
Convergence notions, and differentiability of residual propagator
Convergence of rAA(1) for symmetric $M$
An eigenvector-dependent nonlinear eigenvalue problem
Convergence factor
Numerical results
Numerical results: A nonlinear extension
Convergence of rAA(1) for skew symmetric $M$
Preliminaries
Convergence factor
Numerical results
Independence of convergence factor on initial iterate
...and 11 more sections

Key Result

Theorem 3.2

Suppose that $\mathcal{A} \colon D \subset \mathbb{R}^n \to \mathbb{R}^n$ has a fixed-point $\bm{x}_*$ that is an interior point of $D$ and is differentiable at $\bm{x}_*$. If the spectral radius of $\mathcal{A}'( \bm{x}_* )$ satisfies

Figures (9)

Figure 1: Convergence of rAA(1) for symmetric $M \in \mathbb{R}^{2 \times 2}$ with eigenvalues $m_1$ and $m_2$. Top: Cross-sections of the convergence factor \ref{eq:rAA1-rho-general} as a function of $\varepsilon \in [10^{-2}, 10^2]$ at $(m_1, m_2)$ indicated in the legend. Bottom left: Worst-case convergence factor for rAA(1) as in \ref{thm:rho-symm}. Bottom right: Ratio of the rAA(1) worst-case convergence factor to that of the underlying fixed-point iteration; the dashed black line is the only region where $\varrho_{\rm AA}^{\rm worst} < \varrho_{\rm FP}^{\rm worst}$ does not hold.
Figure 1: For skew-symmetric matrices $M$ with spectral radius $\rho(M)$. Left: Worst-case root-linear convergence factor, as per \ref{cor:skew-worst-case}. Right: For a given $\rho(M)$, the number of iterations $k_*$ required to reduce the error by a factor of $10^{-\nu_*}$, with values of $\nu_*$ indicated in the legend. Asterisk markers represent AA, and triangle markers the underlying fixed-point iteration.
Figure 2: Supporting numerical evidence for \ref{thm:rho-symm} using three symmetric matrices $M$. For each $M$, shown is the convergence factor $\varrho_k(\bm{r}_0)$ as a function of $k$ (see \ref{def:conv-fac}). Triangle markers depict the underlying fixed-point iteration \ref{eq:PI-iter}, and asterisk markers the rAA(1) iteration \ref{eq:rAA1-iter}. For each matrix, each algorithm is initialized with 30 different $\bm{r}_0$ chosen at random. The thick black dotted line is the worst-case convergence factor for the underlying fixed-point iteration, and the thick blue dashed line is the lower bound for that of the rAA(1) iteration given in \ref{thm:rho-symm}. The bottom right plot shows the eigenvalues for each test matrix $M$.
Figure 2: For two matrices $M = \tfrac{1}{2} \delta t \mathrm{D} \Lambda$ similar to skew-symmetric matrices, shown is the convergence factor $\varrho_k(\bm{r}_0)$ as a function of the iteration index $k$, which tends to the asymptotic convergence factor as ${k \to \infty}$ (see \ref{def:conv-fac}). Triangle markers depict the underlying fixed-point iteration \ref{eq:PI-iter}, and asterisk markers the rAA(1) iteration \ref{eq:rAA1-iter}. For both problems, each algorithm is initialized with 30 different $\bm{r}_0$ chosen at random. Left: $\mathrm{D}$ is a 2nd-order accurate finite-difference discretization using $n = 64$ points. Right: $\mathrm{D}$ is a Fourier spectral differentiation matrix with $n = 31$ collocation points. The worst-case asymptotic convergence factor \ref{eq:rho-PI-wc} for the underlying fixed-point iteration is shown in each plot as a thick, black dotted line, and that of \ref{eq:rho-skew-wc} for the rAA(1) iteration is shown as the thick, blue dashed line.
Figure 3: Supporting numerical evidence for the extension of results from the linear to nonlinear setting by solving a nonlinear elliptic boundary value problem with an inexact Newton algorithm. Triangle markers correspond to the underlying fixed-point iteration \ref{eq:nonlin-q-MG}, and asterisk markers to the associated rAA(1) iteration. Left: Numerically measured root-convergence factor. The thick broken lines reflect the theoretically predicted worst-case convergence factor for each method. Right: Relative reduction in residual as a function of iteration.
...and 4 more figures

Theorems & Definitions (34)

Definition 3.1: Root convergence
Theorem 3.2: Ostrowski
Definition 3.3: Directional derivative
Lemma 3.4: Continuity and differentiability of $\mathcal{R}$
Proof 1
Theorem 4.1
Proof 2
Conjecture 4.2
Theorem 4.3
Proof 3
...and 24 more
