A New Linesearch for Accelerated Composite Minimization

Reza Rahimi Baghbadorani; Sergio Grammatico; Peyman Mohajerin Esfahani

TL;DR

This work addresses the long-standing challenge of selecting stepsizes in first-order convex optimization without relying on a known global smoothness constant. It introduces a novel zero-order linesearch that relies only on function evaluations, applied to both non-accelerated and accelerated gradient methods through a gradient-mapping framework for composite objectives. The authors prove convergence guarantees, achieving O(1/k) for non-accelerated and O(1/k^2) for accelerated schemes, and demonstrate near-optimal performance on smooth, composite, and non-convex problems. The approach is hyperparameter-free for the composite setting and shows strong empirical performance across diverse problem classes, suggesting broad practical impact for large-scale optimization tasks.

Abstract

The choice of the stepsize in first-order convex optimization is typically based on the smoothness constant and plays a crucial role in the performance of algorithms. Recently, there has been a resurgence of interest in adaptive stepsizes that do not explicitly depend on the smoothness constant. In this paper, we propose a novel linesearch stepsize rule based on function evaluations (i.e., zero-order information) that enjoys provable convergence guarantees for both accelerated and non-accelerated gradient descent. We further discuss the similarities and differences between the proposed stepsize regimes and existing stepsize rules (including Polyak and Armijo). We numerically benchmark the performance of our proposed algorithms against state-of-the-art methods across three major problem classes: (1) smooth minimization (logistic regression, quadratic programs, log-sum-exponential, and smooth max-cut relaxation), (2) composite minimization ($\ell_1$-regularized least-squares, $\ell_1$-constrained least-squares, and $\ell_1$-regularized logistic regression), and (3) non-convex minimization (cubic minimization). These classes cover a wide range of operations research and management applications such as portfolio optimization, discrete choice models, sparse classification and feature selection, high-order optimization, and trust-region subproblems.
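For orientation, the two existing rules named in the abstract can be sketched as follows. This is a minimal illustration of the textbook Armijo backtracking rule and the Polyak stepsize on a smooth problem, not the linesearch proposed in the paper, whose exact rule is not reproduced in this summary. The toy quadratic and the parameters `lam0`, `beta`, and `c` are assumptions chosen for the example.

```python
def armijo_step(f, x, g, lam0=1.0, beta=0.5, c=1e-4):
    """Backtrack from lam0 until f(x - lam*g) <= f(x) - c*lam*||g||^2."""
    fx = f(x)
    gg = sum(gi * gi for gi in g)
    lam = lam0
    while f([xi - lam * gi for xi, gi in zip(x, g)]) > fx - c * lam * gg:
        lam *= beta  # shrink the trial stepsize
    return lam


def polyak_step(f, x, g, f_star):
    """Polyak stepsize (f(x) - f*) / ||g||^2; needs the optimal value f*."""
    gg = sum(gi * gi for gi in g)
    return (f(x) - f_star) / gg


# Hypothetical toy problem: f(x) = 0.5 * (x1^2 + 10 * x2^2), with f* = 0.
f = lambda x: 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)
grad = lambda x: [x[0], 10.0 * x[1]]

# Gradient descent driven by Armijo backtracking (function values + gradient).
x = [1.0, 1.0]
for _ in range(200):
    g = grad(x)
    if sum(gi * gi for gi in g) == 0.0:
        break  # exact stationary point reached
    lam = armijo_step(f, x, g)
    x = [xi - lam * gi for xi, gi in zip(x, g)]
```

Neither rule requires the global smoothness constant, but Armijo needs the tuning parameters `beta` and `c`, and Polyak needs the optimal value `f_star`.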

Paper Structure (19 sections, 6 theorems, 57 equations, 6 figures, 2 tables)

Introduction
Non-Accelerated Adaptive Stepsize
Preliminaries
Non-accelerated composite minimization
Non-accelerated smooth minimization
Comparison of different stepsize regimes
Accelerated Adaptive Stepsize
Numerical Results
Smooth minimization
(i) Logistic regression:
(ii) Quadratic programming:
(iii) Log-Sum-Exp:
(iv) Approximate semidefinite programming:
Composite minimization
(i) $\ell_1$-Regularized least squares [neykov2016l1, selesnick2017sparse]:
...and 4 more sections

Key Result

Lemma 2.2

Let $G^{f}_{\lambda h}(x)$ be the gradient mapping defined in \ref{grad_mapping} for a smooth convex function $f$, a possibly nonsmooth function $h$, and a positive constant $\lambda \in \mathbb{R}_+$.
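As a reading aid, the sketch below computes a gradient mapping under the assumption that the paper uses the standard definition $G^{f}_{\lambda h}(x) = \big(x - \mathrm{prox}_{\lambda h}(x - \lambda \nabla f(x))\big)/\lambda$; the paper's own definition is not reproduced in this summary. The toy objective, the $\ell_1$ choice of $h$, and all function names are hypothetical.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def gradient_mapping(x, grad_f, prox_h, lam):
    """Assumed standard form: (x - prox_{lam*h}(x - lam*grad_f(x))) / lam."""
    g = grad_f(x)
    forward = [xi - lam * gi for xi, gi in zip(x, g)]  # gradient step on f
    x_plus = prox_h(forward, lam)                      # proximal step on h
    return [(xi - pi) / lam for xi, pi in zip(x, x_plus)]


# Hypothetical toy instance: f(x) = 0.5 * ||x - c||^2, h(x) = mu * ||x||_1.
c = [3.0, -0.5, 0.2]
mu = 1.0
grad_f = lambda x: [xi - ci for xi, ci in zip(x, c)]
prox_h = lambda v, lam: soft_threshold(v, lam * mu)

x0 = [0.0, 0.0, 0.0]
G0 = gradient_mapping(x0, grad_f, prox_h, lam=1.0)

# With h = 0 the prox is the identity and the mapping is the plain gradient.
G_smooth = gradient_mapping(x0, grad_f, lambda v, lam: v, lam=0.5)

# At the minimizer of f + h the gradient mapping vanishes.
x_star = soft_threshold(c, mu)
G_star = gradient_mapping(x_star, grad_f, prox_h, lam=1.0)
```

Under this definition the gradient mapping plays the role of the gradient in the composite setting: it equals $\nabla f(x)$ when $h = 0$ and is zero exactly at minimizers of $f + h$.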

Figures (6)

Figure 1: Geometric interpretation of different stepsize rules using $\phi_k(\lambda)$ defined in \ref{phi func}.
Figure 2: Initial choice of $\lambda_0$ at the $({k+1})^{\text{th}}$ iteration.
Figure 3: The results for the class (1) smooth minimization. The first row shows the optimality gap, and the second row shows the stepsize behavior.
Figure 4: Approximate maximum eigenvalue \ref{regularized maxcut dual}. The first row shows the optimality gap, and the second row shows the stepsize behavior.
Figure 5: The results for the class (2) composite minimization. The first row shows the optimality gap, and the second row shows the stepsize behavior.
...and 1 more figures

Theorems & Definitions (10)

Lemma 2.2: Gradient mapping
Theorem 2.3: Non-accelerated adaptive stepsize
proof
Corollary 2.4: Locally smooth function
proof
Remark 2.5: Approximate adaptive stepsize rule
Corollary 2.6: Convergence of smooth minimization
Theorem 3.1: Accelerated adaptive stepsize
proof
Corollary 3.2: Accelerated smooth minimization
