Estimating Sequences with Memory for Minimizing Convex Non-smooth Composite Functions
Endrit Dosti, Sergiy A. Vorobyov, Themistoklis Charalambous
TL;DR
This work develops a memory-augmented generalization of estimating sequences for convex non-smooth composite optimization, enabling acceleration in black-box settings for objectives of the form $F(\boldsymbol x)=f(\boldsymbol x)+\tau g(\boldsymbol x)$ that include a non-smooth term. The proposed method uses generalized composite estimating sequences that incorporate a memory term $\psi_k$ and a reduced gradient, together with a backtracking line-search that makes the algorithm robust to unknown Lipschitz constants and imperfect knowledge of the strong convexity parameter. Theoretical results establish an accelerated convergence rate and robustness guarantees, while numerical experiments on quadratic and logistic losses (including real LIBSVM datasets) show improved performance and monotonicity over benchmarks such as AMGS and FISTA. The approach is practical for large-scale data processing and can be extended to stochastic, higher-order, or nonconvex settings. Overall, the memory-based framework broadens the applicability and reliability of first-order acceleration in composite convex optimization.
Abstract
First-order optimization methods are crucial for solving large-scale data processing problems, particularly those involving convex non-smooth composite objectives. For such problems, we introduce a new class of generalized composite estimating sequences, devised by exploiting the information embedded in the iterates generated during the minimization process. Building on these sequences, we propose a novel accelerated first-order method tailored to this objective structure. The method features a backtracking line-search strategy and achieves an accelerated convergence rate regardless of whether the true Lipschitz constant is known. It is also robust to imperfect knowledge of the strong convexity parameter, a property of significant practical importance. The method's efficiency and robustness are substantiated by comprehensive numerical evaluations on both synthetic and real-world datasets, demonstrating its effectiveness in data processing applications.
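To make the role of the backtracking line-search concrete, the following minimal sketch shows a plain (non-accelerated) proximal gradient iteration for a composite objective $F(\boldsymbol x)=f(\boldsymbol x)+\tau g(\boldsymbol x)$. It is not the method proposed in this work: it contains no estimating sequences and no memory term, and it only illustrates how a Lipschitz estimate can be increased until a sufficient-decrease test holds. The choice $g=\|\cdot\|_1$, the names `soft_threshold` and `prox_grad_backtracking`, and the parameters `L0` and `eta` are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def soft_threshold(v: Vector, t: float) -> Vector:
    """Proximal operator of t * ||.||_1 (illustrative choice of g)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def prox_grad_backtracking(
    f: Callable[[Vector], float],
    grad_f: Callable[[Vector], Vector],
    x0: Vector,
    tau: float,
    L0: float = 1.0,    # initial Lipschitz estimate (assumed, may be too small)
    eta: float = 2.0,   # factor by which the estimate grows on failure
    iters: int = 200,
) -> Tuple[Vector, float]:
    """Plain proximal gradient with backtracking on the Lipschitz estimate.

    Minimizes f(x) + tau * ||x||_1 without knowing the true Lipschitz
    constant of grad_f. Returns the final iterate and the final estimate.
    """
    x = list(x0)
    L = L0
    for _ in range(iters):
        g = grad_f(x)
        fx = f(x)
        while True:
            # Candidate step: gradient step on f, then prox of (tau / L) * g.
            y = soft_threshold([xi - gi / L for xi, gi in zip(x, g)], tau / L)
            d = [yi - xi for yi, xi in zip(y, x)]
            # Quadratic upper model of f around x with curvature L.
            model = (
                fx
                + sum(gi * di for gi, di in zip(g, d))
                + 0.5 * L * sum(di * di for di in d)
            )
            if f(y) <= model + 1e-12:
                break  # sufficient decrease holds, accept the step
            L *= eta   # estimate too small, increase and retry
        x = y
    return x, L
```

For example, with $f(\boldsymbol x)=\tfrac12\|\boldsymbol x-\boldsymbol c\|^2$ the minimizer is the soft-thresholding of $\boldsymbol c$ at level $\tau$, which gives a simple sanity check even when `L0` is chosen well below the true Lipschitz constant.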
