Anderson Acceleration with Truncated Gram-Schmidt
Ziyuan Tang, Tianshi Xu, Huan He, Yousef Saad, Yuanzhe Xi
TL;DR
This work introduces Anderson Acceleration with Truncated Gram-Schmidt (AATGS), a variant of Anderson Acceleration (AA) that builds a locally orthonormal basis via a truncated Gram-Schmidt process to reduce memory and computational cost, while preserving the convergence behavior of classical AA in many cases. The authors prove that AATGS(∞) is equivalent to AA(∞) for linear problems, and show that in the symmetric linear case the iterates obey a short-term recurrence: AATGS(3) matches AATGS(∞) there, which substantially lowers memory and cost. They also develop a lightweight restarting strategy to mitigate numerical instability and validate the method on nonlinear PDEs and optimization tasks, where it shows advantages over classical AA when the Jacobian is nearly symmetric or the Hessian is symmetric. With automatic restarts, the approach performs robustly and applies to both nonlinear equations and minimax optimization, offering practical gains in large-scale computations. Future directions include extending AATGS to stochastic settings and using Jacobian information to further improve robustness and efficiency.
Abstract
Anderson Acceleration (AA) is a popular algorithm designed to enhance the convergence of fixed-point iterations. In this paper, we introduce a variant of AA based on a Truncated Gram-Schmidt process (AATGS), which has a few advantages over the classical AA. In particular, an attractive feature of AATGS is that its iterates obey a three-term recurrence when it is applied to symmetric linear problems, which can lead to a considerable reduction in memory and computational costs. We analyze the convergence of AATGS in both full-depth and limited-depth scenarios and establish its equivalence to the classical AA in the linear case. We also report on the effectiveness of AATGS through a set of numerical experiments, ranging from solving nonlinear partial differential equations to tackling nonlinear optimization problems, and compare the performance of the method with that of the classical AA.
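
For orientation, the classical AA(m) baseline that the abstract refers to can be sketched as follows. This is the textbook least-squares formulation of Anderson Acceleration, not the AATGS variant introduced in the paper. The function name `anderson`, its parameters (`m` for the window depth, `beta` for the mixing parameter), and the stopping rule are illustrative assumptions rather than anything taken from the authors' code.

```python
import numpy as np

def anderson(g, x0, m=5, beta=1.0, tol=1e-10, max_iter=100):
    """Classical AA(m) for the fixed-point problem x = g(x) (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    f = g(x) - x                      # residual of the fixed-point map
    dX, dF = [], []                   # recent differences of iterates and residuals
    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k
        if dF:
            DX = np.column_stack(dX)
            DF = np.column_stack(dF)
            # gamma minimizes ||f - DF @ gamma||_2 (dense least squares)
            gamma, *_ = np.linalg.lstsq(DF, f, rcond=None)
            x_new = x + beta * f - (DX + beta * DF) @ gamma
        else:
            x_new = x + beta * f      # first step: plain damped fixed-point update
        f_new = g(x_new) - x_new
        dX.append(x_new - x)
        dF.append(f_new - f)
        if len(dX) > m:               # keep at most m difference pairs
            dX.pop(0)
            dF.pop(0)
        x, f = x_new, f_new
    return x, max_iter
```

A call such as `anderson(np.cos, np.array([1.0]), m=1)` iterates toward the fixed point of the cosine map (approximately 0.739); with `m=1` on a scalar problem the update reduces to the secant method. The sketch stores up to `m` difference pairs and solves a dense least-squares problem at every step, which is the memory and cost that a short-term recurrence would avoid in the symmetric linear case.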
