Efficient globalization of heavy-ball type methods for unconstrained optimization based on curve searches

Federica Donnini, Matteo Lapucci, Pierluigi Mansueto

TL;DR

This paper develops a curve-search globalization framework for unconstrained optimization, focusing on Polyak's heavy-ball updates. The method backtracks along a parametric search curve starting from the tentative update to enforce an Armijo-type decrease, enabling global convergence even in nonconvex settings. By analyzing quadratic search curves, it proves worst-case complexity $\mathcal{O}(\epsilon^{-2})$ to reach approximate stationarity and shows that heavy-ball steps can be retained in strongly convex problems. Numerical experiments on strongly convex and CUTEst problems demonstrate practical efficiency gains over classical safeguards, with code available publicly.

Abstract

In this work, we deal with unconstrained nonlinear optimization problems. Specifically, we are interested in methods whose updates may follow directions that are not descent directions, such as Polyak's heavy-ball algorithm. Instead of enforcing convergence through line searches and modifying the search direction whenever suitable safeguards are violated, we propose a strategy based on searches along curved paths: a curve search starting from the first tentative update makes it possible to revert smoothly towards a gradient-related direction if a sufficient decrease condition is not met. The resulting algorithm provably possesses global convergence guarantees, even with a nonmonotone decrease condition. While the presented framework is rather general, the case of parabolic searches is of particular interest; in this case, under reasonable assumptions, the resulting algorithm can be shown to attain optimal worst-case complexity bounds for reaching approximate stationarity in nonconvex settings. Practically, we show that the proposed globalization strategy makes it possible to consistently accept (optimal) pure heavy-ball steps in the strongly convex case, whereas standard globalization approaches would at times reject them before even evaluating the objective function. Preliminary computational experiments also suggest that the proposed framework might be more convenient than classical safeguard-based approaches.
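The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code, and several details are assumptions made for illustration: the gradient-related direction is taken as $d = -\alpha \nabla f(x)$, the tentative heavy-ball step as $s = d + \beta (x^k - x^{k-1})$, the curve as the parabola $\gamma(t) = x + t(1-t)d + t^2 s$ (so that $\gamma(0)=x$, $\gamma(1)=x+s$, $\gamma'(0)=d$), and the acceptance test as the monotone Armijo-type condition $f(\gamma(t)) \le f(x) + \sigma t \nabla f(x)^\top d$. The function name and all parameter names are hypothetical.

```python
import numpy as np

def curve_search_heavy_ball(f, grad, x0, alpha=0.1, beta=0.9, sigma=1e-4,
                            delta=0.5, tol=1e-6, max_iter=10_000,
                            max_backtracks=60):
    """Heavy-ball iteration globalized by backtracking along a parabola.

    Illustrative sketch only: the direction d, the curve and the
    Armijo-type test are assumed forms, not taken from the paper.
    """
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    iters = 0
    while iters < max_iter:
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        d = -alpha * g                  # gradient-related direction (assumed choice)
        s = d + beta * (x - x_prev)     # tentative heavy-ball step
        fx = f(x)
        slope = g @ d                   # directional derivative along d, negative
        t = 1.0                         # t = 1 tries the pure heavy-ball update first
        x_new = x.copy()                # fallback: stay put if no t is accepted
        for _ in range(max_backtracks):
            trial = x + t * (1.0 - t) * d + t**2 * s   # point gamma(t) on the parabola
            if f(trial) <= fx + sigma * t * slope:     # Armijo-type sufficient decrease
                x_new = trial
                break
            t *= delta                  # shrink t: the curve bends back towards d
        x_prev, x = x, x_new
        iters += 1
    return x, iters
```

As a usage example, on the strongly convex quadratic $f(x) = \tfrac{1}{2} x^\top A x - b^\top x$ with `A = np.diag([1.0, 10.0])` and `b = np.ones(2)`, calling `curve_search_heavy_ball(lambda x: 0.5 * x @ A @ x - b @ x, lambda x: A @ x - b, np.zeros(2))` drives the iterate towards the minimizer $A^{-1}b$. Starting the backtracking at $t = 1$ means the unmodified heavy-ball step is evaluated first and is only altered when the decrease test fails.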

Paper Structure

This paper contains 13 sections, 10 theorems, 52 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Let $x^k \in \mathbb{R}^n$ and $d_k \in \mathbb{R}^n$ be such that $\nabla f(x^k)^\top d_k < 0$. Let $\gamma_k:[0,1]\to\mathbb{R}^n$ be a descent curve for $f$ in $x^k$ such that $\gamma_k(t) = \gamma(t;x^k,d_k,\xi_k)$ and Assumption \ref{ass::defcurve} is satisfied. Then, the Armijo-type curve search proced

Figures (3)

  • Figure 1: Quadratic search curve starting at $x$, ending at $x+s$ and with initial velocity $d$. The points $P_0=x$, $P_1 = x+\frac{1}{2}d$ and $P_2=x+s$ are the control points of Bézier's expression of the curve.
  • Figure 2: Plot of the distance between the current solution $x^k$ and the optimal solution $x^\star$ of problem \ref{eq::logistic} across the iterations for CS, CS_NMT, GD, M_HB, M_RES. Both axes are displayed on a logarithmic scale.
  • Figure 3: Performance profiles in terms of $f^\star$ and $T$ obtained by CS, GD, M_HB, M_RES and M_BETA on the CUTEst problems listed in Table \ref{tab::problems}.
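
The control points in the caption of Figure 1 determine the quadratic curve explicitly. The expansion below is reconstructed from that caption using the standard quadratic Bézier formula, not quoted from the paper; the symbol $\gamma(t)$ is borrowed from Proposition 1.

$$
\begin{aligned}
\gamma(t) &= (1-t)^2 P_0 + 2t(1-t)\,P_1 + t^2 P_2 \\
          &= (1-t)^2 x + 2t(1-t)\left(x + \tfrac{1}{2}d\right) + t^2 (x+s) \\
          &= x + t(1-t)\,d + t^2 s, \qquad t \in [0,1].
\end{aligned}
$$

Hence $\gamma(0) = x$ and $\gamma(1) = x+s$, and since $\gamma'(t) = (1-2t)\,d + 2t\,s$, the initial velocity is $\gamma'(0) = d$, in agreement with the caption.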

Theorems & Definitions (26)

  • Definition 1
  • Definition 2
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Remark 1
  • Proposition 3
  • proof
  • Proposition 4
  • ...and 16 more