Asymptotic convergence of restarted Anderson acceleration for certain normal linear systems

Oliver A. Krzysik, Hans De Sterck, Adam Smith

TL;DR

This work analyzes restarted Anderson acceleration with restart size one (rAA(1)) for affine fixed-point iterations $\mathbf{x}_{k+1}=\mathbf{q}(\mathbf{x}_k)=M\mathbf{x}_k+\mathbf{b}$, focusing on the cases where $M$ is symmetric or skew-symmetric. For symmetric $M$, the authors derive an eigenvector-dependent nonlinear eigenvalue problem (NEPv) that governs the four-step residual map, show how the asymptotic convergence factor can depend strongly on the initial iterate, and provide explicit two-by-two closed forms that capture this dependence. For skew-symmetric $M$, they show that the rAA(1) residual iteration behaves like a power iteration on the dominant skew-symmetric subspace, leading to a worst-case convergence factor $\varrho^{\rm worst}_{\rm AA} = \rho(M)/(1+\rho(M)^2)^{1/4}$ that is independent of the initial residual, with a sharp convergence threshold $\rho(M) < \sqrt{(1+\sqrt{5})/2}$. Numerical experiments corroborate the theory, illustrate nonlinear extensions, and indicate applicability to more general nonlinear $\mathbf{q}$ with symmetric or skew-symmetric Jacobians at the fixed point. These results provide concrete, exact convergence factors in special but meaningful settings and offer insight into how AA accelerates fixed-point iterations beyond existing bounds, with implications for windowed variants and practical nonlinear problems.
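
The convergence threshold quoted above follows from the worst-case factor by elementary algebra. A short derivation, using only the formula stated in this summary and writing $\rho = \rho(M) \geq 0$ for brevity (a shorthand introduced here, not the paper's notation):

```latex
\begin{align*}
\varrho^{\rm worst}_{\rm AA} = \frac{\rho}{(1+\rho^2)^{1/4}} < 1
  &\iff \rho^4 < 1 + \rho^2 \\
  &\iff \rho^4 - \rho^2 - 1 < 0 \\
  &\iff \rho^2 < \frac{1+\sqrt{5}}{2} \\
  &\iff \rho < \sqrt{\frac{1+\sqrt{5}}{2}} \approx 1.272 .
\end{align*}
```

Because $(1+\rho^2)^{1/4} > 1$ for every $\rho > 0$, the same formula also gives $\varrho^{\rm worst}_{\rm AA} < \rho$. Assuming the worst-case factor of the underlying fixed-point iteration is $\rho(M)$, this agrees with the claim that rAA(1) is strictly faster in the worst case, and it shows that rAA(1) can converge for some $\rho(M) > 1$ where the underlying iteration diverges.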

Abstract

Anderson acceleration (AA) is widely used for accelerating the convergence of an underlying fixed-point iteration $\bm{x}_{k+1} = \bm{q}( \bm{x}_{k} )$, $k = 0, 1, \ldots$, with $\bm{x}_k \in \mathbb{R}^n$, $\bm{q} \colon \mathbb{R}^n \to \mathbb{R}^n$. Despite AA's widespread use, relatively little is understood theoretically about the extent to which it may accelerate the underlying fixed-point iteration. To this end, we analyze a restarted variant of AA with a restart size of one, a method closely related to GMRES(1). We consider the case of $\bm{q}( \bm{x} ) = M \bm{x} + \bm{b}$ with matrix $M \in \mathbb{R}^{n \times n}$ either symmetric or skew-symmetric. For both classes of $M$ we compute the worst-case root-average asymptotic convergence factor of the AA method, relying in part on a conjecture in the symmetric setting, and prove that it is strictly smaller than that of the underlying fixed-point iteration. For symmetric $M$, we show that the AA residual iteration corresponds to a fixed-point iteration for solving an eigenvector-dependent nonlinear eigenvalue problem (NEPv), and we show how this can result in the convergence factor strongly depending on the initial iterate, which we quantify exactly in certain special cases. Conversely, for skew-symmetric $M$ we show that the AA residual iteration is closely related to a power iteration for $M$, and that this results in the convergence factor being independent of the initial iterate. Supporting numerical results are given, which also indicate that the theory is applicable to the more general setting of nonlinear $\bm{q}$ whose Jacobian at the fixed point is symmetric or skew-symmetric.
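
To make the method concrete, here is a minimal Python sketch of one common way to write AA(1) with a restart every second iteration, applied to $\bm{q}(\bm{x}) = M\bm{x} + \bm{b}$. It is an illustration under stated assumptions, not the authors' code: the function name `raa1`, the residual sign convention $\bm{r} = \bm{x} - \bm{q}(\bm{x})$, and the choice to restart on even-indexed steps are all assumptions made here, and the paper's indexing may differ. The test problem is a $2 \times 2$ skew-symmetric $M$, for which every vector satisfies $\|M\bm{r}\| = \rho(M)\|\bm{r}\|$, so in this special case the measured per-iteration factor should match $\rho(M)/(1+\rho(M)^2)^{1/4}$ for any starting point.

```python
import numpy as np

def raa1(M, b, x0, num_iters):
    """Sketch of restarted AA(1) for q(x) = M x + b (hypothetical helper).

    Even-indexed steps are plain fixed-point steps (the restart); odd-indexed
    steps are AA(1) steps that combine the two most recent iterates.
    Returns the final iterate and the residual norms ||x_k - q(x_k)||.
    """
    x = np.asarray(x0, dtype=float)
    q_prev = r_prev = None
    res_norms = []
    for k in range(num_iters):
        qx = M @ x + b
        r = x - qx  # residual convention assumed here
        res_norms.append(np.linalg.norm(r))
        if k % 2 == 0:
            # Restart: discard history and take a fixed-point step.
            x_next = qx
        else:
            # AA(1): choose beta to minimize ||r + beta * (r - r_prev)||_2.
            dr = r - r_prev
            denom = dr @ dr
            beta = -(r @ dr) / denom if denom > 0 else 0.0
            x_next = qx + beta * (qx - q_prev)
        q_prev, r_prev = qx, r
        x = x_next
    res_norms.append(np.linalg.norm(x - (M @ x + b)))
    return x, np.array(res_norms)

# 2x2 skew-symmetric example with spectral radius rho > 1,
# so the plain fixed-point iteration diverges.
rho = 1.2
M = np.array([[0.0, rho], [-rho, 0.0]])
b = np.array([1.0, -2.0])
x, res = raa1(M, b, np.zeros(2), 20)

# Residual reduction over each restart cycle (two iterations).
two_step = res[2::2] / res[:-2:2]
predicted = rho / (1 + rho**2) ** 0.25
print(np.sqrt(two_step), predicted)
```

Each restart cycle here amounts to a GMRES(1)-type minimization followed by one multiplication by $M$, which is why the reduction is measured over two iterations and then square-rooted to give a per-iteration factor.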

Paper Structure (26 sections, 14 theorems, 83 equations, 9 figures)

Key Result

Theorem 3.2

Suppose that $\mathcal{A} \colon D \subset \mathbb{R}^n \to \mathbb{R}^n$ has a fixed point $\bm{x}_*$ in the interior of $D$ and that $\mathcal{A}$ is differentiable at $\bm{x}_*$. If the spectral radius of $\mathcal{A}'( \bm{x}_* )$ satisfies

Figures (9)

  • Figure 1: Convergence of rAA(1) for symmetric $M \in \mathbb{R}^{2 \times 2}$ with eigenvalues $m_1$ and $m_2$. Top: Cross-sections of the convergence factor \ref{eq:rAA1-rho-general} as a function of $\varepsilon \in [10^{-2}, 10^2]$ at the $(m_1, m_2)$ values indicated in the legend. Bottom left: Worst-case convergence factor for rAA(1) as in \ref{thm:rho-symm}. Bottom right: Ratio of the rAA(1) worst-case convergence factor to that of the underlying fixed-point iteration; the dashed black line is the only region where $\varrho_{\rm AA}^{\rm worst} < \varrho_{\rm FP}^{\rm worst}$ does not hold.
  • Figure 1: Results for skew-symmetric matrices $M$ with spectral radius $\rho(M)$. Left: Worst-case root-linear convergence factor, as per \ref{cor:skew-worst-case}. Right: For a given $\rho(M)$, the number of iterations $k_*$ required to reduce the error by a factor of $10^{-\nu_*}$, with values of $\nu_*$ indicated in the legend. Asterisk markers represent AA, and triangle markers the underlying fixed-point iteration.
  • Figure 2: Supporting numerical evidence for \ref{thm:rho-symm} using three symmetric matrices $M$. For each $M$, shown is the convergence factor $\varrho_k(\bm{r}_0)$ as a function of $k$ (see \ref{def:conv-fac}). Triangle markers depict the underlying fixed-point iteration \ref{eq:PI-iter}, and asterisk markers the rAA(1) iteration \ref{eq:rAA1-iter}. For each matrix, each algorithm is initialized with 30 different $\bm{r}_0$ chosen at random. The thick black dotted line is the worst-case convergence factor for the underlying fixed-point iteration, and the thick blue dashed line is the lower bound for that of the rAA(1) iteration given in \ref{thm:rho-symm}. The bottom right plot shows the eigenvalues for each test matrix $M$.
  • Figure 2: For two matrices $M = \tfrac{1}{2} \delta t \mathrm{D} \Lambda$ similar to skew-symmetric matrices, shown is the convergence factor $\varrho_k(\bm{r}_0)$ as a function of the iteration index $k$, which tends to the asymptotic convergence factor as ${k \to \infty}$ (see \ref{def:conv-fac}). Triangle markers depict the underlying fixed-point iteration \ref{eq:PI-iter}, and asterisk markers the rAA(1) iteration \ref{eq:rAA1-iter}. For both problems, each algorithm is initialized with 30 different $\bm{r}_0$ chosen at random. Left: $\mathrm{D}$ is a 2nd-order accurate finite-difference discretization using $n = 64$ points. Right: $\mathrm{D}$ is a Fourier spectral differentiation matrix with $n = 31$ collocation points. The worst-case asymptotic convergence factor \ref{eq:rho-PI-wc} for the underlying fixed-point iteration is shown in each plot as a thick, black dotted line, and that of \ref{eq:rho-skew-wc} for the rAA(1) iteration is shown as the thick, blue dashed line.
  • Figure 3: Supporting numerical evidence for the extension of results from the linear to nonlinear setting by solving a nonlinear elliptic boundary value problem with an inexact Newton algorithm. Triangle markers correspond to the underlying fixed-point iteration \ref{eq:nonlin-q-MG}, and asterisk markers to the associated rAA(1) iteration. Left: Numerically measured root-convergence factor. The thick broken lines reflect the theoretically predicted worst-case convergence factor for each method. Right: Relative reduction in residual as a function of iteration.
  • ...and 4 more figures

Theorems & Definitions (34)

  • Definition 3.1: Root convergence
  • Theorem 3.2: Ostrowski
  • Definition 3.3: Directional derivative
  • Lemma 3.4: Continuity and differentiability of $\mathcal{R}$
  • Proof 1
  • Theorem 4.1
  • Proof 2
  • Conjecture 4.2
  • Theorem 4.3
  • Proof 3
  • ...and 24 more