The iterates of FISTA converge even under inexact computations and stochastic gradients

Saverio Salzo

TL;DR

It is proved that, in infinite-dimensional Hilbert spaces, the iterates of FISTA still converge (in the weak sense) even when the proximity operator and the gradient are computed inexactly, with the latter possibly stochastic.

Abstract

Very recently, the papers "Point Convergence of Nesterov's Accelerated Gradient Method: An AI-Assisted Proof" by Jang and Ryu, and "The Iterates of Nesterov's Accelerated Algorithm Converge in the Critical Regimes" by Bot, Fadili, and Nguyen have simultaneously resolved a long-standing open problem concerning Nesterov's accelerated gradient method. These works show that the iterates of the algorithm (known in its composite form as FISTA) do converge to an optimal solution. In this work, we extend these results and prove that, in infinite-dimensional Hilbert spaces, the iterates of the algorithm still converge (in the weak sense) even when the proximity operator and the gradient are computed inexactly, with the latter possibly stochastic.
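
As a rough illustration of the setting described in the abstract, the sketch below runs a textbook FISTA iteration in which both the gradient and the proximity operator are corrupted by errors that decay along the iterations. It is not the algorithm analyzed in the paper: the test problem (a diagonal quadratic plus an l1 penalty), the names `soft_threshold` and `inexact_fista`, the error schedule `1 / k**3`, and the way inexactness is simulated (Gaussian noise on the gradient, a bounded perturbation of the exact prox) are all assumptions made for this example.

```python
import math
import random


def soft_threshold(v, tau):
    """Exact proximity operator of tau * |.| applied to a scalar v."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def inexact_fista(a, b, lam, n_iter=2000, seed=0):
    """Minimize 0.5 * sum_i a[i] * (x[i] - b[i])**2 + lam * sum_i |x[i]|.

    Textbook FISTA with simulated errors (hypothetical setup, not the
    paper's scheme): the gradient is perturbed by zero-mean Gaussian noise
    and the prox output by a bounded perturbation, both of size 1 / k**3.
    """
    rng = random.Random(seed)
    n = len(b)
    gamma = 1.0 / max(a)  # step size 1/L, with L the Lipschitz constant of the gradient
    x = [0.0] * n         # main iterate x_k
    y = x[:]              # extrapolated point y_k
    t = 1.0               # classical FISTA momentum parameter t_k
    for k in range(1, n_iter + 1):
        err = 1.0 / k**3  # assumed summable error level
        # Stochastic gradient: exact gradient at y plus zero-mean noise.
        grad = [a[i] * (y[i] - b[i]) + rng.gauss(0.0, err) for i in range(n)]
        # Inexact prox: exact soft-thresholding plus a perturbation in [-err, err].
        x_new = [
            soft_threshold(y[i] - gamma * grad[i], gamma * lam) + rng.uniform(-err, err)
            for i in range(n)
        ]
        # Standard FISTA update of the momentum parameter and extrapolation.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        y = [x_new[i] + beta * (x_new[i] - x[i]) for i in range(n)]
        x, t = x_new, t_new
    return x
```

For this separable problem the minimizer is available in closed form, `soft_threshold(b[i], lam / a[i])` in each coordinate, so the output of `inexact_fista([1.0, 2.0, 4.0], [3.0, -1.0, 0.5], 1.0)` can be compared against it directly. A perturbation of bounded norm is a simpler notion of inexactness than the one the paper works with, so the sketch only conveys the general idea of vanishing errors.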

Paper Structure

This paper contains 11 sections, 7 theorems, 98 equations, 2 algorithms.

Key Result

Proposition 1.2

Let $g\colon \mathcal{H}\to \left]-\infty,+\infty\right]$ be a proper convex and lower semicontinuous function, $y, z\in \mathcal{H}$, and $\delta\in \mathbb{R}_+$. Let $\gamma>0$ and suppose that $z \simeq_\delta \mathop{\mathrm{prox}}\nolimits_{\gamma g}(y)$. Then the following hold.

Theorems & Definitions (27)

  • Definition 1.1
  • Proposition 1.2 (Lemma 2.4 in Salzo-Villa12)
  • Remark 2.2
  • proof
  • Remark 2.5
  • proof (sketch)
  • Proposition 3.1
  • proof
  • Remark 3.2
  • Lemma 3.3
  • ...and 17 more