Accelerated Proximal Gradient Methods in the affine-quadratic case: Strong convergence and limit identification

Walaa M. Moursi, Andrew Naguib, Viktor Pavlovic, Stephen A. Vavasis

TL;DR

This work analyzes accelerated proximal gradient methods in the affine-quadratic regime, where f is quadratic and g is the indicator of a closed affine subspace. By recasting APG updates in terms of an affine nonexpansive operator, the authors establish that the APG limit coincides with the best-approximation projection of the starting point onto the solution set, and that the difference between APG and PGM iterates vanishes weakly. Under mild conditions on the momentum parameters, strong convergence follows, and the results are shown to be tight via a two-dimensional counterexample demonstrating non-coincidence outside the affine-quadratic setting. The paper also extends the analysis to cones and affine subspaces and provides numerical experiments illustrating limit identification in underdetermined image reconstruction problems. Overall, the findings clarify when APG shares the PGM limit and under what conditions it converges strongly, contributing to the understanding and practical deployment of accelerated methods in convex optimization.

Abstract

Recent works by Bot-Fadili-Nguyen (arXiv:2510.22715) and by Jang-Ryu (arXiv:2510.23513) resolve long-standing iterate convergence questions for accelerated (proximal) gradient methods. In particular, Bot-Fadili-Nguyen prove weak convergence of discrete accelerated gradient descent (AGD) iterates and, crucially, convergence of the accelerated proximal gradient (APG) method in the composite setting, with extensions to infinite-dimensional Hilbert spaces. In parallel, Jang-Ryu establish point convergence for the continuous-time accelerated flow and for discrete AGD in finite dimensions. These results leave open which minimizer is selected by the iterates. We answer this in the affine-quadratic setting: when initialized at the same point, the difference between the proximal gradient (PGM) and APG iterates converges weakly to zero. Consequently, APG converges weakly to the best approximation of the initial point in the solution set. Moreover, under mild assumptions on the parameter sequence, we obtain strong convergence of APG. The result is tight: a two-dimensional example shows that coincidence of the APG and PGM limits is specific to the affine-quadratic regime and does not hold in general.
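The limit-identification claim can be checked numerically on a small instance. The sketch below is illustrative only and is not taken from the paper: the matrices `A`, `C`, the vectors `b`, `d`, `x0`, and the helper names `project_W`, `T`, `pgm`, `apg` are all hypothetical, and the momentum rule is the classical FISTA choice $t_{k+1} = \tfrac{1}{2}\bigl(1+\sqrt{1+4t_k^2}\bigr)$, which is assumed here to be one admissible parameter sequence. It takes $f(x)=\tfrac12\|Ax-b\|^2$ and $g$ the indicator of $W=\{x : Cx=d\}$, runs PGM and APG from the same $x_0$, and compares both limits with the projection of $x_0$ onto the solution set.

```python
import numpy as np

# Hypothetical toy data (not from the paper).
# f(x) = 0.5 * ||A x - b||^2, g = indicator of W = {x : C x = d}.
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0]])
b = np.array([2.0, 1.0])
C = np.array([[1.0, 1.0, 1.0, 1.0]])
d = np.array([1.0])

L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of grad f
C_pinv = np.linalg.pinv(C)


def project_W(x):
    """Orthogonal projection onto the affine subspace W = {x : C x = d}."""
    return x - C_pinv @ (C @ x - d)


def T(x):
    """One forward-backward step: gradient step on f, then projection onto W."""
    return project_W(x - A.T @ (A @ x - b) / L)


def pgm(x0, iters):
    """Proximal gradient method: x_{k+1} = T(x_k)."""
    x = x0.copy()
    for _ in range(iters):
        x = T(x)
    return x


def apg(x0, iters):
    """Accelerated proximal gradient with the classical FISTA momentum (assumed)."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = T(y)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x


x0 = np.array([5.0, 0.0, -3.0, 2.0])

# This toy system is consistent, so the solution set is S = {x : A x = b, C x = d}
# and the best approximation of x0 in S is available in closed form.
M = np.vstack([A, C])
m = np.concatenate([b, d])
best_approx = x0 - np.linalg.pinv(M) @ (M @ x0 - m)

x_pgm = pgm(x0, 500)
x_apg = apg(x0, 500)

print("distance PGM limit to P_S(x0):", np.linalg.norm(x_pgm - best_approx))
print("distance APG limit to P_S(x0):", np.linalg.norm(x_apg - best_approx))
```

Both printed distances should be close to machine precision on this instance. The closed-form reference relies on the system being consistent; for an inconsistent least-squares problem the solution set would need to be computed differently.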

Paper Structure

This paper contains 14 sections, 18 theorems, 101 equations, 3 figures.

Key Result

Theorem 2.1

Let $W$ be a closed affine subspace of $X$, let $(x_k)_{k\in{\mathbb N}}$ and $(p_k)_{k\in{\mathbb N}}$ be sequences in $X$, and let $(s_k)_{k\in{\mathbb N}}$ be a sequence in $(\operatorname{par} W)^\perp$. Suppose that $(\forall {k\in{\mathbb N}})$ Then the following hold.

Figures (3)

  • Figure 1: A Python plot illustrating \ref{prop:example-w2}. Shown are the initial point $(5,0)$, the first $40$ MAP iterates $(p_k)_{k\in{\mathbb N}}$ and FISTA iterates $(x_k)_{k\in{\mathbb N}}$, along with the limits $p^*$ and $x^*$.
  • Figure 2: \ref{fig:base_img} is the frame preceding the corrupted one we receive, \ref{fig:uncorr_img} is the uncorrupted frame (the one we try to recover), and \ref{fig:corr_img} is the corrupted version with $40\%$ of pixels missing.
  • Figure 3: \ref{fig:recon_ones} is the reconstruction from $x_0=(1,1,\ldots,1)$, \ref{fig:recon_zeros} is the reconstruction from $x_0=(0,0,\ldots,0)$, and \ref{fig:recon_rand} is the reconstruction from a random $x_0\in \mathbb{R}^{65{,}536}$.

Theorems & Definitions (48)

  • Example 1: Method of Alternating Projections (MAP) as a PGM iterate.
  • Remark 1
  • Remark 2
  • proof
  • Theorem 2.1
  • proof
  • Lemma 1
  • proof
  • Proposition 1
  • proof
  • ...and 38 more