Inexact subgradient methods for semialgebraic functions

Jérôme Bolte; Tam Le; Éric Moulines; Edouard Pauwels

TL;DR

This work analyzes inexact (biased) subgradient methods for locally Lipschitz, semialgebraic functions under persistent additive errors. By leveraging a continuous-time differential inclusion framework together with the Kurdyka–Łojasiewicz inequality and metric subregularity, it shows that biased iterates with vanishing or small constant steps approach the $\epsilon$-critical region and remain within $O(\epsilon^{\rho})$ of the true critical set, with $\rho$ reflecting geometric properties. In the convex case, a universal error bound yields explicit complexity guarantees for averaged iterates, extending classical results to inexact settings without requiring compactness. The paper also develops descent-type lemmas and an invariance principle linking discrete-time updates to continuous-time dynamics, with extensions to o-minimal definable settings and potential stochastic generalizations. These results provide rigorous performance guarantees for optimization routines that rely on inexact subgradient information, which is common in large-scale machine learning and nonconvex nonsmooth optimization contexts.
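For orientation, the objects named in this summary can be written out as follows. This is a sketch in assumed notation, not the paper's own statement: the symbols $f$, $v_k$, $e_k$, $\partial^c f$ (a Clarke-type subdifferential) and $\mathbb{B}$ (the closed unit ball) are illustrative choices, and the paper's exact recursion and assumptions are not reproduced on this page.

$$
x_{k+1} = x_k - \alpha_k \,(v_k + e_k), \qquad v_k \in \partial^c f(x_k), \qquad \|e_k\| \leq \epsilon ,
$$

$$
\dot{x}(t) \in -\partial^c f(x(t)) + \epsilon\, \mathbb{B}, \qquad
\mathrm{crit}_{\epsilon} f = \{\, x : \mathrm{dist}(0, \partial^c f(x)) \leq \epsilon \,\} .
$$

Under this reading, the first line is the biased discrete-time update, the differential inclusion is its small-step continuous-time counterpart, and $\mathrm{crit}_{\epsilon} f$ is the $\epsilon$-critical region that the iterates are said to approach; taking $\epsilon = 0$ recovers the usual critical set.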

Abstract

Motivated by the extensive application of approximate gradients in machine learning and optimization, we investigate inexact subgradient methods subject to persistent additive errors. Within a nonconvex semialgebraic framework, assuming boundedness or coercivity, we establish that the method yields iterates that eventually fluctuate near the critical set at a proximity characterized by an $O(\epsilon^{\rho})$ distance, where $\epsilon$ denotes the magnitude of subgradient evaluation errors, and $\rho$ encapsulates geometric characteristics of the underlying problem. Our analysis comprehensively addresses both vanishing and constant step-size regimes. Notably, the latter regime inherently enlarges the fluctuation region, yet this enlargement remains on the order of $\epsilon^{\rho}$. In the convex scenario, employing a universal error bound applicable to coercive semialgebraic functions, we derive novel complexity results concerning averaged iterates. Additionally, our study produces auxiliary results of independent interest, including descent-type lemmas for nonsmooth nonconvex functions and an invariance principle governing the behavior of algorithmic sequences under small-step limits.
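The fluctuation behavior described in the abstract can be illustrated numerically on one-dimensional toy problems. The sketch below is not taken from the paper: the function name `biased_subgradient_method`, the two test objectives, and all parameter values are assumptions chosen for illustration. It runs a constant-step update with an additive error bounded by `eps` on (a) $f(x) = x^4/4$ with a worst-case constant error, where the iterates settle at distance $\epsilon^{1/3}$ from the critical point $0$ (so the exponent is $1/3$ for this particular objective), and (b) $f(x) = |x|$ with random errors, where the iterates end up in a band around $0$ whose width is governed by the step size.

```python
import random


def biased_subgradient_method(subgrad, x0, step, eps, n_iter, error=None, seed=0):
    """Iterate x_{k+1} = x_k - step * (v_k + e_k), with v_k = subgrad(x_k), |e_k| <= eps.

    `error` maps the iteration index k to e_k; if omitted, e_k is drawn
    uniformly from [-eps, eps]. All names are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = float(x0)
    trajectory = [x]
    for k in range(n_iter):
        e_k = error(k) if error is not None else rng.uniform(-eps, eps)
        assert abs(e_k) <= eps  # the error must respect the tolerance
        x = x - step * (subgrad(x) + e_k)
        trajectory.append(x)
    return trajectory


def sign(x):
    """One subgradient of |x|: the sign of x, with 0 chosen at x = 0."""
    return (x > 0) - (x < 0)


# (a) Smooth toy objective f(x) = x**4 / 4 with a constant worst-case error
# e_k = -eps. The update has the fixed point x = eps**(1/3), so the limit
# sits at distance eps**(1/3) from the true critical point 0.
eps_quartic = 1e-3
quartic_traj = biased_subgradient_method(
    subgrad=lambda x: x ** 3,
    x0=1.0,
    step=0.5,
    eps=eps_quartic,
    n_iter=20000,
    error=lambda k: -eps_quartic,
)

# (b) Nonsmooth toy objective f(x) = |x| with random errors and eps < 1.
# Once an iterate enters [-step * (1 + eps), step * (1 + eps)] it stays there,
# so the constant step size alone determines the width of the band.
eps_abs, step_abs = 0.5, 0.01
abs_traj = biased_subgradient_method(
    subgrad=sign, x0=5.0, step=step_abs, eps=eps_abs, n_iter=5000
)

print("quartic: last iterate", quartic_traj[-1], "vs eps**(1/3) =", eps_quartic ** (1 / 3))
print("abs: max |x| over last 1000 iterates", max(abs(x) for x in abs_traj[-1000:]))
```

Shrinking `step_abs` narrows the band in (b), while the offset in (a) depends only on `eps_quartic`; this mirrors, in a very simplified setting, the separation between step-size effects and error effects that the abstract describes.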

Paper Structure (19 sections, 20 theorems, 49 equations)

Introduction
Preliminaries and statement of the main results
Notations.
Main results
The nonconvex setting
The convex setting
Reading keys and natural extensions
The continuous-time system and auxiliary results
Asymptotics of the biased dynamics
Estimates under the nonsmooth KL inequality and a metric subregularity condition
Coercivity and boundedness of $\epsilon$-critical points
Main consequences for the biased subgradient method
Link between discrete and continuous-time
Descent lemmas
Vanishing step sizes
...and 4 more sections

Key Result

Theorem 2.1

Under ass:mainSemiAlgebraic, there are $\bar{\epsilon} > 0$, $C > 0$, $\rho > 0$ such that for any $\epsilon < \bar{\epsilon}$ and $x_0 \in \mathbb{R}^p$, there is $\bar{\alpha} > 0$ such that for any $(x_k)_{k \in \mathbb{N}}$ given by eq:subgradVanishingStepSize with $0 < \alpha_k \leq \bar{\alpha}$ for a...

Theorems & Definitions (27)

Theorem 2.1: Convergence for biased subgradient method with vanishing step size
Theorem 2.2: Convergence for biased subgradient method with constant step size
Remark 2.1: Fluctuations with constant step sizes
Remark 2.2: Local Lipschitz continuity
Theorem 2.3: Biased subgradient complexity: convex case
Lemma 3.1: Descent properties
Remark 3.1
Lemma 3.2: Semialgebraicity implies regularity
Remark 3.2: Beyond semialgebraicity
Lemma 3.3: Approximate stationarity of near-critical curves
...and 17 more
