
An Abstract Lyapunov Control Optimizer: Local Stabilization and Global Convergence

Bilel Bensaid, Gaël Poëtte, Rodolphe Turpault

TL;DR

This work treats adaptive step-size optimizers as discretizations of ODEs and introduces Lyapunov-based backtracking schemes, LCR and LCM, that enforce a dissipation condition $V(y_{n+1})-V(y_n) \leq \lambda \eta_n \dot{V}(y_n)$ at each step. To overcome the non-smooth, non-continuous update of the learning rate, the authors deploy selection theory to obtain a continuous selector $s$ on $\mathbb{R}^m\setminus\mathcal{Z}$, enabling a discrete LaSalle-type analysis of limit points. They prove local stability around isolated stationary points and, under KL assumptions, stability and attractivity of the global minima set, with concrete implications for GD, RMSProp, and p-GD. A general convergence framework shows that, when trajectories stay bounded, LCR/LCM converge to the set where $\dot{V}=0$, with rates characterized by KL exponents; the results hold for both constant and adaptive step policies and yield subexponential, exponential, or finite-time convergence depending on the problem structure. Overall, the paper provides rigorous guarantees that adaptive Lyapunov-based step-size rules preserve the qualitative dynamics of the underlying continuous system and deliver global convergence without requiring precise hyperparameter tuning.
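To make the dissipation condition concrete, here is a minimal sketch for the simplest case mentioned above: gradient descent, seen as a discretization of $y' = -\nabla V(y)$, for which $\dot{V}(y) = -\|\nabla V(y)\|^2$. It is not the paper's LCR or LCM algorithm, whose exact step-size update rules are not reproduced here. The function name `lyapunov_backtracking_gd`, the halving rule, the safety floor `eta_min`, and the quadratic test problem are all assumptions made for illustration.

```python
import numpy as np

def lyapunov_backtracking_gd(V, grad, y0, eta0=1.0, lam=0.5, shrink=0.5,
                             eta_min=1e-12, n_steps=300, tol=1e-12):
    """Hypothetical sketch: gradient descent whose step size eta is shrunk until
    V(y_next) - V(y) <= lam * eta * Vdot(y), with Vdot(y) = -||grad V(y)||^2."""
    y = np.asarray(y0, dtype=float)
    history = [V(y)]
    for _ in range(n_steps):
        g = grad(y)
        v_dot = -float(g @ g)          # derivative of V along y' = -grad V(y)
        if -v_dot <= tol:              # (numerically) stationary point reached
            break
        eta = eta0                     # assumption: restart from eta0 at every step
        # Backtrack until the dissipation condition holds; eta_min is a safety
        # floor added for this sketch, not something taken from the paper.
        while V(y - eta * g) - V(y) > lam * eta * v_dot and eta > eta_min:
            eta *= shrink
        y = y - eta * g
        history.append(V(y))
    return y, history

# Illustrative ill-conditioned quadratic, V(y) = 0.5 * y^T A y (an assumption).
A = np.diag([1.0, 10.0])
V_quad = lambda y: 0.5 * float(y @ A @ y)
grad_quad = lambda y: A @ y

y_final, history = lyapunov_backtracking_gd(V_quad, grad_quad, y0=[1.0, 1.0])
```

In this setting the condition coincides with the classical Armijo sufficient-decrease rule, so the recorded values of $V$ are non-increasing along the iterates whatever the initial step `eta0`.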

Abstract

Recently, many machine learning optimizers have been analysed by relating them to the differential equations obtained in the limit as the step size goes to zero. In other words, the optimizers can be seen as finite difference schemes applied to a continuous dynamical system. However, most of the results in the literature concern constant step size algorithms. The main aim of this paper is to investigate the guarantees of their adaptive step size counterparts. In fact, this dynamical point of view can be used to design step size update rules, by choosing a discretization of the continuous equation that preserves its most relevant features. In this work, we analyse this kind of adaptive optimizer and prove Lyapunov stability and convergence properties for any choice of hyperparameters. To the best of our knowledge, this paper introduces for the first time the use of continuous selection theory from general topology to overcome some of the intrinsic difficulties due to non-constant and non-regular step size policies. The general framework developed gives many new results on adaptive and constant step size Momentum/Heavy-Ball and p-GD algorithms.

Paper Structure

This paper contains 12 sections, 17 theorems, 135 equations, 2 algorithms.

Key Result

Proposition 1

Let $\mathcal{R}$ be differentiable and assume its gradient is locally Lipschitz. Consider the sequence $(\theta_n)_{n\in\mathbb{N}}$ generated by the algorithm LCM with $F=\nabla \mathcal{R}$ and $V=\mathcal{R}$ and assume that $(\theta_n)_{n\in\mathbb{N}}$ is bounded. Then the set of accumulation

Theorems & Definitions (47)

  • Proposition 1: GD limit set
  • proof
  • Theorem 1: Selection Theorem
  • Lemma 1
  • proof
  • Lemma 2: Increasing implicit function lemma
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • ...and 37 more