Optimal Algorithms for Stochastic Complementary Composite Minimization

Alexandre d'Aspremont, Cristóbal Guzmán, Clément Lezane

TL;DR

This paper addresses stochastic complementary composite minimization, where the objective is $\Psi(x)=F(x)+H(x)$ with a weakly smooth stochastic $F$ and a uniformly convex (possibly nonsmooth) $H$. It introduces two stochastic mirror-descent algorithms, NACSMD (non-accelerated) and ACSMD (accelerated), augmented by a restarting scheme to achieve linear convergence up to statistical noise, and provides both in-expectation and high-probability guarantees. The authors derive matching lower bounds (deterministic and stochastic) to establish near-optimality and demonstrate the efficacy of their methods through numerical experiments on generalized ridge regression, highlighting robustness to mis-specified smoothness and clear acceleration benefits. The results advance the understanding of stochastic optimization with uniform convexity and offer practically efficient schemes with flexible step-size schedules and strong probabilistic guarantees. Overall, the work delivers novel upper and lower complexity bounds, high-probability estimates, and practical algorithms for a broad class of stochastic, structured regularization problems.
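To make the problem setting concrete, here is a minimal sketch of one stochastic composite step on a generalized ridge regression instance. It is not the paper's NACSMD or ACSMD: it is a plain Euclidean proximal stochastic gradient step, and the specific choices $F(x)=\frac{1}{2n}\|Ax-b\|_2^2$, $H(x)=\frac{\mu}{q}\|x\|_2^q$, the step size `gamma`, the minibatch sampling, and all function names are illustrative assumptions rather than details taken from the source.

```python
import numpy as np

def objective(x, A, b, mu, q):
    """Psi(x) = F(x) + H(x), with the assumed F and H described above."""
    return 0.5 * np.mean((A @ x - b) ** 2) + (mu / q) * np.linalg.norm(x) ** q

def prox_power_norm(v, gamma, mu, q, iters=100):
    """Return argmin_u gamma*(mu/q)*||u||_2^q + 0.5*||u - v||_2^2 (q >= 2)."""
    r = np.linalg.norm(v)
    if r == 0.0:
        return np.zeros_like(v)
    # The minimizer is t * v / r, where t in [0, r] solves
    # gamma*mu*t**(q-1) + t = r. The left side increases with t,
    # so bisection finds the root.
    lo, hi = 0.0, r
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gamma * mu * mid ** (q - 1) + mid > r:
            hi = mid
        else:
            lo = mid
    return (0.5 * (lo + hi) / r) * v

def stochastic_composite_step(x, A, b, gamma, mu, q, batch, rng):
    """One step: minibatch gradient of F, then an exact prox step on H."""
    idx = rng.integers(0, A.shape[0], size=batch)  # uniform, with replacement
    A_b, b_b = A[idx], b[idx]
    g = A_b.T @ (A_b @ x - b_b) / batch  # unbiased estimate of grad F(x)
    return prox_power_norm(x - gamma * g, gamma, mu, q)

# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
b = A @ np.ones(5)
x = np.zeros(5)
for _ in range(200):
    x = stochastic_composite_step(x, A, b, 0.05, 0.1, 3, 10, rng)
print(objective(x, A, b, 0.1, 3))
```

The regularizer is handled exactly through its proximal step instead of being linearized, which is the structural idea behind composite methods; only the smooth part $F$ is accessed through noisy gradients.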

Abstract

Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization in the stochastic setting. This problem corresponds to the minimization of the sum of a (weakly) smooth function endowed with a stochastic first-order oracle, and a structured uniformly convex (possibly nonsmooth and non-Lipschitz) regularization term. Despite intensive work on closely related settings, prior to our work no complexity bounds for this problem were known. We close this gap by providing novel excess risk bounds, both in expectation and with high probability. Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems. We conclude by providing numerical results comparing our methods to the state of the art.

Paper Structure (28 sections, 13 theorems, 94 equations, 2 figures, 3 tables, 3 algorithms)

Key Result

Lemma 4

Let $f$ be a convex function and $\nu$ be convex and continuously differentiable. If we consider then for all $u$:

Figures (2)

  • Figure 1: Performance comparison between the NACSMD and ACSMD algorithms and the one suggested in GL12II, each combined with the restarting scheme. We evaluate how fast the log relative error decreases over iterations for an extended ridge regression problem.
  • Figure 2: Performance comparison for the ACSMD algorithm with different polynomial degrees, as described before, without the restarting scheme. We evaluate how fast the log relative error decreases over iterations for an extended ridge regression problem.

Theorems & Definitions (29)

  • Definition 1: Uniform convexity
  • Example 1
  • Definition 2: Weak smoothness
  • Definition 3: Bregman Divergence
  • Lemma 4
  • Proof 1
  • Lemma 5
  • Proof 2
  • Corollary 6
  • Lemma 7
  • ...and 19 more
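
For orientation, the first three definitions listed above are usually stated along the following lines. These are the standard textbook forms, not quotations from the paper: the symbols $\mu$, $q$, $L$, $\kappa$ and the notation $D^{\nu}$ are assumptions, and the paper's normalization of constants may differ.

$$
\begin{aligned}
&\text{Uniform convexity } (q \ge 2):
&& H(x) \ge H(y) + \langle g, x - y \rangle + \frac{\mu}{q}\,\|x - y\|^{q}
\quad \text{for all } x, y \text{ and } g \in \partial H(y), \\
&\text{Weak smoothness } (1 < \kappa \le 2):
&& \|\nabla F(x) - \nabla F(y)\|_{*} \le L\,\|x - y\|^{\kappa - 1}
\quad \text{for all } x, y, \\
&\text{Bregman divergence:}
&& D^{\nu}(x, y) = \nu(x) - \nu(y) - \langle \nabla \nu(y), x - y \rangle .
\end{aligned}
$$

With $q = 2$ the first line reduces to strong convexity, and with $\kappa = 2$ the second reduces to the usual Lipschitz-gradient smoothness.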