Restart-Free (Accelerated) Gradient Sliding Methods for Strongly Convex Composite Optimization
Xinming Wu, Zi Xu, Huiling Zhang
TL;DR
The paper addresses composite convex optimization problems whose objective combines a smooth component $f$, a nonsmooth component $h$, and a proximable term $\chi$, and proposes a restart-free stochastic gradient sliding method (RF-SGS) that avoids the overhead of restart phases. For structured max-form nonsmooth terms, where the problem can be recast as a bilinear saddle-point problem, it introduces RF-ASGS, a restart-free accelerated scheme that works with smooth approximations $h_\eta$ of $h$ under carefully chosen parameters. The authors prove that RF-SGS reaches an $\epsilon$-solution with $O\left(\log\frac{1}{\epsilon}\right)$ evaluations of $\nabla f$ and $O\left(\dfrac{1}{\epsilon}\right)$ evaluations of a stochastic subgradient $h'$ of $h$, while RF-ASGS needs $O\left(\sqrt{L/\mu}\,\log\frac{1}{\epsilon}\right)$ evaluations of $\nabla f$ and $O\left(\|K\|/\sqrt{\epsilon}\right)$ evaluations of the operators $K$ and $K^T$. Numerical experiments on portfolio optimization and total-variation denoising show smoother convergence and competitive performance relative to restart-based methods.
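To make the smoothing step concrete, here is a minimal Python sketch. It is not the paper's construction: it assumes the max-form term is $h(x)=\max_{\|y\|_\infty\le 1}\langle Kx,y\rangle=\|Kx\|_1$ (the form that arises in anisotropic total-variation denoising) and applies a standard Nesterov-style quadratic smoothing. The function names, the feasible set for $y$, and the quadratic prox term are illustrative assumptions; the paper's own $h_\eta$ and parameter choices are not given in this summary.

```python
import numpy as np

def h(K, x):
    """Nonsmooth max-form term (assumed): max_{||y||_inf <= 1} <Kx, y> = ||Kx||_1."""
    return float(np.abs(K @ x).sum())

def h_eta(K, x, eta):
    """Smoothed term (assumed): max_{||y||_inf <= 1} <Kx, y> - (eta/2) ||y||^2."""
    z = K @ x
    # The inner problem is separable and concave in y, so the maximizer is
    # the unconstrained solution z / eta clipped to the box [-1, 1].
    y = np.clip(z / eta, -1.0, 1.0)
    return float(z @ y - 0.5 * eta * (y @ y))

def grad_h_eta(K, x, eta):
    """Gradient of h_eta: K^T y*(x), where y*(x) is the inner maximizer."""
    y = np.clip((K @ x) / eta, -1.0, 1.0)
    return K.T @ y
```

Under these assumptions, $0 \le h(x) - h_\eta(x) \le \eta m/2$ for $K \in \mathbb{R}^{m\times n}$, while $\nabla h_\eta$ is Lipschitz with constant $\|K\|^2/\eta$. A smaller $\eta$ therefore tightens the approximation but makes the smooth surrogate harder to minimize, which is the trade-off a parameter choice for $\eta$ has to balance.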
Abstract
In this paper, we study a class of composite optimization problems whose objective function is the sum of a general smooth component and a general nonsmooth component, together with a relatively simple nonsmooth term. While restart strategies are commonly employed in first-order methods to achieve optimal convergence under strong convexity, they introduce structural complexity and practical overhead, making algorithm design and nesting cumbersome. To address this, we propose a \emph{restart-free} stochastic gradient sliding algorithm that eliminates the need for explicit restart phases when the simple nonsmooth component is strongly convex. Through a novel and carefully designed parameter selection strategy, we prove that the proposed algorithm achieves an $\epsilon$-solution with only $\mathcal{O}(\log(\frac{1}{\epsilon}))$ gradient evaluations for the smooth component and $\mathcal{O}(\frac{1}{\epsilon})$ stochastic subgradient evaluations for the nonsmooth component, matching the optimal complexity of existing multi-phase (restart-based) methods. Moreover, for the case where the nonsmooth component is structured, allowing the overall problem to be reformulated as a bilinear saddle-point problem, we develop a restart-free accelerated stochastic gradient sliding algorithm. We show that the resulting method requires only $\mathcal{O}(\log(\frac{1}{\epsilon}))$ gradient computations for the smooth component while preserving an overall iteration complexity of $\mathcal{O}(\frac{1}{\sqrt{\epsilon}})$ for solving the corresponding saddle-point problems. Our work thus provides simpler, restart-f
