A Unified Theory of Stochastic Proximal Point Methods without Smoothness
Peter Richtárik, Abdurakhmon Sadiev, Yury Demidovich
TL;DR
This work provides a unified, smoothness-free theory for stochastic proximal point methods (SPPM) by introducing a universal algorithm, SPPM-LC, that uses learned corrections. A parametric $\sigma_k^2$ framework yields a single linear convergence theorem under $\mu$-strong convexity that covers standard SPPM, variance-reduced variants, and new algorithms. The analysis recovers the best-known rates for existing methods, introduces three new variants, and illustrates their practical behavior through numerical experiments. The framework offers an approach to stochastic optimization with proximal updates that is robust to imperfect tuning, and it sets the stage for extensions to distributed, compressed, or nonconvex settings.
Abstract
This paper presents a comprehensive analysis of a broad range of variants of the stochastic proximal point method (SPPM). Proximal point methods have attracted considerable interest owing to their numerical stability and robustness to imperfect tuning, a trait not shared by the dominant stochastic gradient descent (SGD) algorithm. We introduce a framework of assumptions that encompasses methods employing techniques such as variance reduction and arbitrary sampling. A cornerstone of our general theoretical approach is a parametric assumption on the iterates, correction vectors, and control vectors. We establish a single theorem that ensures linear convergence under this assumption and $\mu$-strong convexity of the loss function, without invoking smoothness. This unifying theorem recovers the best-known complexity and convergence guarantees for several existing methods, which demonstrates the robustness of our approach. We extend our study by developing three new variants of SPPM, and through numerical experiments we elucidate various properties inherent to them.
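To make the basic SPPM update concrete, the following is a minimal sketch of the plain method, in which each step applies the proximal operator of one randomly sampled loss: $x_{k+1} = \operatorname{prox}_{\gamma f_{\xi_k}}(x_k)$. The test problem (ridge-regularized least squares, chosen because its prox has a closed form), the function names `prox_ridge_sample` and `sppm`, and all parameter values are assumptions made for illustration and are not taken from the paper. The losses here happen to be smooth, although the theory described above does not require that.

```python
import numpy as np


def prox_ridge_sample(x, a, b, mu, gamma):
    """Exact prox of gamma * f_i at x for the illustrative loss
    f_i(z) = 0.5 * (a @ z - b)**2 + 0.5 * mu * ||z||^2.

    The prox solves (c*I + gamma * a a^T) z = x + gamma * b * a,
    with c = 1 + gamma * mu, via the Sherman-Morrison formula.
    """
    c = 1.0 + gamma * mu
    v = x + gamma * b * a
    return (v - gamma * a * (a @ v) / (c + gamma * (a @ a))) / c


def sppm(A, b, mu, gamma, num_steps, seed=0):
    """Plain SPPM with uniform sampling and a constant step size gamma."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(num_steps):
        i = rng.integers(n)  # sample one loss uniformly at random
        x = prox_ridge_sample(x, A[i], b[i], mu, gamma)
    return x


# Small synthetic problem (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
n, d, mu = 50, 5, 0.1
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Minimizer of the average loss, used only to measure the error.
x_star = np.linalg.solve(A.T @ A / n + mu * np.eye(d), A.T @ b / n)

# Step sizes spanning several orders of magnitude.
for gamma in (0.01, 1.0, 100.0):
    x = sppm(A, b, mu, gamma, num_steps=2000)
    print(f"gamma={gamma:g}  distance to minimizer={np.linalg.norm(x - x_star):.4f}")
```

Because each step solves a small regularized subproblem exactly, the iterates stay bounded even for very large `gamma`, which is one way to see the robustness to step-size tuning mentioned above. With a constant step size, the iterates of this sketch should be expected to settle in a neighborhood of the minimizer rather than converge to it exactly.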
