On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms
Puya Latafat, Andreas Themelis, Panagiotis Patrinos
TL;DR
This paper develops a linesearch-free adaptive framework for proximal gradient methods applied to convex optimization problems with nonsmooth terms. It introduces adaPG$^{q,r}$, a two-parameter scheme that permits larger stepsizes and improved lower bounds through backward-looking Lipschitz estimates, together with a general convergence theory that allows time-varying parameters. The idea is extended to an adaptive alternating minimization algorithm (AMA) via a dual formulation, yielding AdaAMA$^{q,r}$, which relaxes strong convexity to local strong convexity and thereby broadens applicability. The resulting algorithms are evaluated in numerical experiments, and nonconvex and bilevel extensions are identified as promising directions.
Abstract
Building upon recent works on linesearch-free adaptive proximal gradient methods, this paper proposes adaPG$^{q,r}$, a framework that unifies and extends existing results by providing larger stepsize policies and improved lower bounds. Different choices of the parameters $q$ and $r$ are discussed and the efficacy of the resulting methods is demonstrated through numerical simulations. To better understand the underlying theory, convergence is established in a more general setting that allows for time-varying parameters. Finally, an adaptive alternating minimization algorithm is presented by exploring the dual setting. This algorithm not only incorporates additional adaptivity, but also expands its applicability beyond standard strongly convex settings.
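
To illustrate what a linesearch-free adaptive proximal gradient iteration can look like, the following is a minimal sketch and not the adaPG$^{q,r}$ update itself, whose exact stepsize formula is not stated in this abstract. It assumes a simplified rule: the stepsize may grow by a bounded factor at each iteration and is capped by the reciprocal of twice a local Lipschitz estimate computed from the two most recent iterates. The function names, the growth factor, the cap, and the test problem are all assumptions made for illustration, and no convergence guarantee is claimed for this simplified rule.

```python
import math

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, applied componentwise."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def adaptive_prox_grad(grad_f, prox_g, x0, gamma0=1e-3, iters=200, tol=1e-12):
    """Proximal gradient with a stepsize estimated from past iterates.

    Illustrative rule (an assumption, not the paper's update):
        gamma_next = min(sqrt(1 + gamma / gamma_prev) * gamma, 1 / (2 * L_k)),
    where L_k = ||grad_f(x) - grad_f(x_prev)|| / ||x - x_prev||.
    """
    x_prev = list(x0)
    g_prev = grad_f(x_prev)
    gamma_prev = gamma = gamma0
    # First step with the (small) initial stepsize.
    x = prox_g([xi - gamma * gi for xi, gi in zip(x_prev, g_prev)], gamma)
    for _ in range(iters):
        g = grad_f(x)
        ndx = norm([a - b for a, b in zip(x, x_prev)])
        ndg = norm([a - b for a, b in zip(g, g_prev)])
        if ndx <= tol:
            break  # successive iterates coincide up to tolerance
        growth = math.sqrt(1.0 + gamma / gamma_prev) * gamma
        # Cap by the local Lipschitz estimate when it is well defined.
        gamma_next = min(growth, ndx / (2.0 * ndg)) if ndg > 0 else growth
        x_prev, g_prev = x, g
        gamma_prev, gamma = gamma, gamma_next
        x = prox_g([xi - gamma * gi for xi, gi in zip(x, g)], gamma)
    return x

# Hypothetical test problem: minimize 0.5 * ||x - b||^2 + lam * ||x||_1,
# whose minimizer is soft_threshold(b, lam) in closed form.
b, lam = [3.0, -0.5, 1.5], 1.0
x_hat = adaptive_prox_grad(
    grad_f=lambda x: [xi - bi for xi, bi in zip(x, b)],
    prox_g=lambda v, gamma: soft_threshold(v, gamma * lam),
    x0=[0.0, 0.0, 0.0],
)
```

The sketch needs neither a global Lipschitz constant nor function evaluations for a backtracking loop; each iteration uses only one gradient and one proximal map, which is the sense in which such methods are linesearch-free.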
