The rate of convergence of Bregman proximal methods: Local geometry vs. regularity vs. sharpness

Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos

TL;DR

This paper analyzes the last-iterate convergence rates of Bregman proximal methods for constrained, possibly non-monotone variational inequalities by introducing the Legendre exponent $\beta$, which measures the growth rate of the underlying Bregman function near a solution. It shows a sharp dichotomy: when $\beta=0$ the last iterate converges linearly, while for $\beta>0$ the convergence is generally sublinear, with the Bregman divergence to the solution decaying as $D(x^{\ast},x_t)=O(t^{-(1-\beta)/\beta})$. In linearly constrained problems, the authors derive accelerated rates along sharp directions: finite-time convergence for Euclidean kernels, linear (geometric) rates for entropic kernels, and power-law rates for Tsallis-type kernels, all tied to the active constraint structure. The results show how local geometry and constraint activity determine algorithmic speed, which can guide kernel choice in practice and suggests directions for future work on non-Euclidean Bregman proximal methods in broader geometric settings.

Abstract

We examine the last-iterate convergence rate of Bregman proximal methods - from mirror descent to mirror-prox and its optimistic variants - as a function of the local geometry induced by the prox-mapping defining the method. For generality, we focus on local solutions of constrained, non-monotone variational inequalities, and we show that the convergence rate of a given method depends sharply on its associated Legendre exponent, a notion that measures the growth rate of the underlying Bregman function (Euclidean, entropic, or other) near a solution. In particular, we show that boundary solutions exhibit a stark separation of regimes between methods with a zero and non-zero Legendre exponent: the former converge at a linear rate, while the latter converge, in general, sublinearly. This dichotomy becomes even more pronounced in linearly constrained problems where methods with entropic regularization achieve a linear convergence rate along sharp directions, compared to convergence in a finite number of steps under Euclidean regularization.
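To make the dichotomy concrete, here is a minimal Python sketch on a one-dimensional toy problem. The problem is an assumption chosen for illustration and is not claimed to be the paper's own example: the operator $v(x) = x$ on $\mathcal{X} = [0,\infty)$, whose solution $x^{\ast} = 0$ lies on the boundary. The function names, the step size `gamma`, and the horizon `steps` are likewise hypothetical. The Euclidean update is a projected step; the entropic update uses the kernel $h(x) = x\log x - x$, for which the Bregman divergence to the solution is $D(0, x) = x$.

```python
import math


def euclidean_step(x, gamma):
    # Projected step on [0, inf) for the toy operator v(x) = x.
    return max(0.0, x - gamma * x)


def entropic_step(x, gamma):
    # Mirror step for h(x) = x log x - x: log x_new = log x - gamma * v(x).
    return x * math.exp(-gamma * x)


def run(step, x1, gamma, steps):
    # Return the iterates [x_1, ..., x_steps] of the given update rule.
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(step(xs[-1], gamma))
    return xs


gamma, x1, steps = 0.1, 1.0, 2000
eucl = run(euclidean_step, x1, gamma, steps)
entr = run(entropic_step, x1, gamma, steps)

# Bregman divergences to the solution x* = 0 under each kernel.
eucl_div = [0.5 * x * x for x in eucl]  # D(0, x) = x^2 / 2
entr_div = list(entr)                   # D(0, x) = x

# Euclidean: constant contraction factor per step (geometric decay).
eucl_ratio = eucl_div[11] / eucl_div[10]
# Entropic: D * gamma * t should level off near 1 (decay like 1 / (gamma * t)).
entr_scaled = entr_div[-1] * gamma * steps

print(f"Euclidean per-step contraction of D: {eucl_ratio:.4f}")
print(f"Entropic D * gamma * t at t = {steps}: {entr_scaled:.4f}")
```

In this toy setting the Euclidean divergence contracts by the constant factor $(1-\gamma)^2$ per step, whereas the entropic divergence decays like $1/(\gamma t)$, which is the exponent that the $O(t^{-(1-\beta)/\beta})$ formula gives for $\beta = 1/2$.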

Paper Structure (30 sections, 21 theorems, 117 equations, 2 figures, 2 tables)

Key Result

Lemma 1

Suppose that $f\colon\mathbb{R}_+\to\mathbb{R}_+$ admits the asymptotic expansion $f(u) = u - \lambda u^{1+r} + o(u^{1+r})$ as $u\to 0^{+}$ for positive constants $\lambda,r>0$. Then, for $u_{1} > 0$ small enough, the sequence $u_{t+1} = f(u_{t})$, $t=1,2,\dotsc$, converges to $0$ at a rate of $u_{t} \sim (\lambda rt)^{-1/r}$.
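The stated rate can be checked numerically. The sketch below assumes the simplest map compatible with that rate, $f(u) = u - \lambda u^{1+r}$ with no higher-order terms; this choice, the helper names, and the parameter values are illustrative assumptions rather than anything taken from the paper. It iterates $u_{t+1} = f(u_t)$ and compares the final iterate with the prediction $(\lambda r t)^{-1/r}$.

```python
def model_map(lam, r):
    # Assumed model: f(u) = u - lam * u^(1 + r), with no higher-order terms.
    return lambda u: u - lam * u ** (1.0 + r)


def iterate(f, u1, steps):
    # Return [u_1, ..., u_steps] for the recursion u_{t+1} = f(u_t).
    us = [u1]
    for _ in range(steps - 1):
        us.append(f(us[-1]))
    return us


def predicted(lam, r, t):
    # Asymptotic prediction of the lemma: (lam * r * t)^(-1/r).
    return (lam * r * t) ** (-1.0 / r)


lam, r = 0.5, 1.0
us = iterate(model_map(lam, r), u1=0.1, steps=100_000)

# The ratio of the observed iterate to the prediction should approach 1.
ratio = us[-1] / predicted(lam, r, len(us))
print(f"u_t / (lam * r * t)^(-1/r) at t = {len(us)}: {ratio:.4f}")
```

The heuristic behind the rate is that the recursion behaves like the differential equation $\dot u = -\lambda u^{1+r}$, whose solution satisfies $u^{-r} = \lambda r t + \text{const}$.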

Figures (2)

  • Figure 1: The rate of convergence of mirror descent (MD) in Examples 1–4. The Euclidean and shifted Hellinger regularizers lead to a geometric rate (see left figure); all other examples converge at a polynomial rate.
  • Figure 2: Different boundary solution configurations on the $2$-dimensional unit simplex $\mathcal{X} = \{(x_{1},x_{2},x_{3})\in\mathbb{R}_{+}^{3} : x_{1} + x_{2} + x_{3} = 1\}$ of $\mathbb{R}^{3}$: a non-extreme solution where $g$ is sharp ($\mathcal{A} = \mathcal{A}_{\sharp} = \{1\}$, $\mathcal{A}_{\flat} = \varnothing$; left), an extreme solution where $g$ is not sharp ($\mathcal{A} = \{1,2\}$, $\mathcal{A}_{\sharp} = \{1\}$, $\mathcal{A}_{\flat}=\{2\}$; middle), and a sharp solution ($\mathcal{A} = \mathcal{A}_{\sharp} = \{1, 2\}$, $\mathcal{A}_{\flat} = \varnothing$; right).

Theorems & Definitions (57)

  • Definition 1: Bregman regularizers and related notions
  • Remark 1
  • Example 1: Euclidean regularization
  • Example 2: Entropic regularization
  • Lemma 1
  • Example 3: Fractional power
  • Example 4: Hellinger distance
  • Definition 2
  • Example 5: Non-compatible topologies
  • Theorem 1
  • ...and 47 more