The rate of convergence of Bregman proximal methods: Local geometry vs. regularity vs. sharpness
Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos
TL;DR
This paper analyzes the last-iterate convergence rates of Bregman proximal methods for constrained, possibly non-monotone variational inequalities by introducing the Legendre exponent $\beta$, which measures how fast the underlying Bregman function grows near a solution. It shows a sharp dichotomy: when $\beta=0$ the last iterate converges linearly, while for $\beta>0$ the convergence is generally sublinear, with the Bregman divergence to the solution decaying as $D(x^{\ast},x_t)=O(t^{-(1-\beta)/\beta})$. In linearly constrained problems, the authors derive faster rates along sharp directions: convergence in a finite number of steps for Euclidean kernels, linear (geometric) rates for entropic kernels, and power-law rates for Tsallis-type kernels, all tied to the structure of the active constraints. The results show how local geometry and constraint activity determine algorithmic speed, which can guide the choice of kernel in practice and suggests directions for future work on non-Euclidean Bregman proximal methods in broader geometric settings.
Abstract
We examine the last-iterate convergence rate of Bregman proximal methods, from mirror descent to mirror-prox and its optimistic variants, as a function of the local geometry induced by the prox-mapping defining the method. For generality, we focus on local solutions of constrained, non-monotone variational inequalities, and we show that the convergence rate of a given method depends sharply on its associated Legendre exponent, a notion that measures the growth rate of the underlying Bregman function (Euclidean, entropic, or other) near a solution. In particular, we show that boundary solutions exhibit a stark separation of regimes between methods with a zero and a non-zero Legendre exponent: the former converge at a linear rate, while the latter converge, in general, sublinearly. This dichotomy becomes even more pronounced in linearly constrained problems, where methods with entropic regularization achieve a linear convergence rate along sharp directions, compared to convergence in a finite number of steps under Euclidean regularization.
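The contrast between Euclidean and entropic regularization along sharp directions can be illustrated numerically. The sketch below is not taken from the paper: it assumes a toy problem (minimizing a linear cost $\langle c, x\rangle$ over the probability simplex, whose solution is a vertex), plain mirror descent with a constant step size, and hypothetical helper names (`project_simplex`, `mirror_descent`). Under these assumptions, the Euclidean kernel reduces to projected gradient descent and the entropic kernel to multiplicative weights.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept in the support
    tau = css[rho] / (rho + 1)                # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def mirror_descent(c, kernel, step=0.4, iters=30):
    """Run mirror descent on min <c, x> over the simplex (toy setup, assumed).

    Returns the L1 distance of each iterate to the optimal vertex.
    """
    n = len(c)
    x = np.full(n, 1.0 / n)                   # start at the barycenter
    target = np.zeros(n)
    target[np.argmin(c)] = 1.0                # the solution is a vertex
    errors = []
    for _ in range(iters):
        if kernel == "euclidean":
            # Euclidean prox-mapping: gradient step followed by projection
            x = project_simplex(x - step * c)
        elif kernel == "entropic":
            # Entropic prox-mapping: multiplicative weights update
            w = x * np.exp(-step * c)
            x = w / w.sum()
        else:
            raise ValueError(f"unknown kernel: {kernel}")
        errors.append(np.abs(x - target).sum())
    return np.array(errors)


c = np.array([0.0, 1.0, 2.0])                 # illustrative cost vector
euclid = mirror_descent(c, "euclidean")
entropic = mirror_descent(c, "entropic")

# Ratio of successive entropic errors, to compare against exp(-step * gap)
ratio = entropic[-1] / entropic[-2]
print("first Euclidean step with error below 1e-12:",
      int(np.argmax(euclid < 1e-12)) + 1)
print("late-stage entropic contraction ratio:", ratio)
```

In this toy instance the Euclidean iterates land on the vertex after a few steps and stay there, whereas the entropic iterates stay in the interior and their error contracts by a roughly constant factor per step. This is only an illustration of the qualitative separation described in the abstract; it does not reproduce the paper's general, non-monotone setting.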
