Proximal gradient methods beyond monotony

Alberto De Marchi

TL;DR

An adaptive nonmonotone proximal gradient scheme based on an averaged merit function is considered and asymptotic convergence guarantees under weak assumptions are established, delivering results on par with the monotone strategy.

Abstract

We address composite optimization problems, which consist in minimizing the sum of a smooth and a merely lower semicontinuous function, without any convexity assumptions. Numerical solutions of these problems can be obtained by proximal gradient methods, which often rely on a line search procedure as globalization mechanism. We consider an adaptive nonmonotone proximal gradient scheme based on an averaged merit function and establish asymptotic convergence guarantees under weak assumptions, delivering results on par with the monotone strategy. Global worst-case rates for the iterates and a stationarity measure are also derived. Finally, a numerical example indicates the potential of nonmonotonicity and spectral approximations.
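To make the idea in the abstract concrete, the following is a minimal sketch of a nonmonotone proximal gradient loop with an averaged merit value; it is not the paper's algorithm as stated. Everything specific here is an assumption made for illustration: the toy problem (a separable quadratic $f$ plus the nonconvex penalty $g = \lambda\|x\|_0$), the function and parameter names (`nonmonotone_prox_grad`, `p`, `delta`, `gamma0`), the exact form of the acceptance test, and the plain halving of the stepsize. Setting `p = 1` recovers a monotone linesearch, while `p < 1` lets the objective rise temporarily as long as it stays below the averaged merit.

```python
def hard_threshold(v, t):
    """One element of the prox of t*||.||_0: keep v_i only if v_i^2 > 2t."""
    return [vi if vi * vi > 2.0 * t else 0.0 for vi in v]


def nonmonotone_prox_grad(q, c, lam, p=0.5, delta=0.1, gamma0=10.0,
                          iters=100, max_backtracks=60):
    """Sketch: minimize f(x) + g(x) with
    f(x) = 0.5 * sum_i q_i (x_i - c_i)^2  (smooth) and
    g(x) = lam * ||x||_0                  (lower semicontinuous, nonconvex).
    All names and the acceptance rule are illustrative assumptions."""

    def f(x):
        return 0.5 * sum(qi * (xi - ci) ** 2 for qi, xi, ci in zip(q, x, c))

    def grad_f(x):
        return [qi * (xi - ci) for qi, xi, ci in zip(q, x, c)]

    def g(x):
        return lam * sum(1 for xi in x if xi != 0.0)

    x = [0.0] * len(c)
    merit = f(x) + g(x)          # averaged merit value, starts at phi(x0)
    history = [merit]            # objective values phi(x_k)

    for _ in range(iters):
        grad = grad_f(x)
        gamma = gamma0
        accepted = False
        for _ in range(max_backtracks):
            # Proximal gradient trial point for the current stepsize.
            forward = [xi - gamma * gi for xi, gi in zip(x, grad)]
            trial = hard_threshold(forward, gamma * lam)
            phi_trial = f(trial) + g(trial)
            step_sq = sum((ti - xi) ** 2 for ti, xi in zip(trial, x))
            # Assumed sufficient-decrease test against the merit value,
            # not against phi(x_k) itself.
            if phi_trial <= merit - delta / (2.0 * gamma) * step_sq:
                accepted = True
                break
            gamma *= 0.5         # backtrack
        if not accepted:
            break                # linesearch failed within the cap
        x = trial
        # Convex combination of old merit and new objective value.
        merit = (1.0 - p) * merit + p * phi_trial
        history.append(phi_trial)
        if step_sq == 0.0:
            break                # fixed point of the prox-gradient map
    return x, history
```

For instance, `nonmonotone_prox_grad([1.0, 4.0, 9.0], [3.0, -2.0, 0.1], 0.5, p=1.0)` runs the monotone variant on a three-dimensional toy problem, and `p=0.5` runs an averaged one. Because each accepted objective value lies below the current merit value, the merit sequence is nonincreasing even when the objective values are not.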

Paper Structure

This paper contains 11 sections, 11 theorems, 29 equations, and 1 figure.

Key Result

Lemma 4.1

Suppose that ass:f and ass:phi are satisfied. Consider the $k$-th iteration of alg:NMPG and assume that at least one of the following conditions holds: Then, the iteration terminates, and in particular state:NMPG:ls is passed in finitely many backtrackings.

Figures (1)

  • Figure 1: Comparison of variants of alg:NMPG on dictionary learning (eq:dictionarylearning), combining plain (dotted) and spectral (solid) stepsizes with monotone and nonmonotone linesearch strategies.

Theorems & Definitions (22)

  • Lemma 4.1: Well-definedness
  • proof
  • Lemma 4.2: Descent behavior
  • proof
  • Lemma 4.3
  • proof
  • Lemma 4.4
  • proof
  • Corollary 4.5
  • Theorem 4.6
  • ...and 12 more