A Smooth Approximation Framework for Weakly Convex Optimization

Qi Deng, Wenzhi Gao

TL;DR

The paper tackles weakly convex optimization where the objective is the sum of a nonsmooth weakly convex part and a convex, prox-friendly part. It develops a unified smooth approximation framework that encompasses Moreau envelope smoothing and generalized Nesterov smoothing, enabling explicit complexity analyses even when the objective is not globally Lipschitz. By embedding smooth approximations into an inexact proximal point scheme and leveraging accelerated solvers with line search, it achieves a deterministic $\mathcal{O}(1/\varepsilon^3)$ guarantee and a stochastic $\mathcal{O}(\max\{1/\varepsilon^3,1/(m\varepsilon^4)\})$ guarantee. The approach also extends smoothing to composite structures, and numerical experiments show smoother convergence and greater robustness than subgradient methods. Overall, the framework provides a flexible toolkit for designing smooth approximation algorithms with provable guarantees for a broad class of weakly convex problems.

Abstract

Standard complexity analyses for weakly convex optimization rely on the Moreau envelope technique proposed by Davis and Drusvyatskiy (2019). The main insight is that nonsmooth algorithms, such as proximal subgradient, proximal point, and their stochastic variants, implicitly minimize a smooth surrogate function induced by the Moreau envelope. Meanwhile, explicit smoothing, which directly minimizes a smooth approximation of the objective, has long been recognized as an efficient strategy for nonsmooth optimization. In this paper, we generalize the notion of smoothable functions, which was proposed by Beck and Teboulle (2012) for nonsmooth convex optimization. This generalization provides a unified viewpoint on several important smoothing techniques for weakly convex optimization, including Nesterov-type smoothing and Moreau envelope smoothing. Our theory yields a framework for designing smooth approximation algorithms for both deterministic and stochastic weakly convex problems with provable complexity guarantees. Furthermore, our theory extends to the smooth approximation of non-Lipschitz functions, allowing for complexity analysis even when global Lipschitz continuity does not hold.
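To make the Moreau envelope smoothing mentioned above concrete, here is a minimal sketch on the simplest nonsmooth example, $f(x) = |x|$. It is an illustration only, not the paper's algorithm. It assumes the common parameterization $f^{\beta}(x) = \min_y \{ f(y) + \frac{1}{2\beta}(y - x)^2 \}$, which may differ from the paper's scaling. The function names (`prox_abs`, `moreau_env_abs`, `moreau_grad_abs`, `huber`) are hypothetical.

```python
# Sketch: Moreau envelope of f(x) = |x| under the ASSUMED scaling
#   f^beta(x) = min_y { |y| + (y - x)^2 / (2 * beta) },  beta > 0.
# The paper's own parameterization may differ; all names are illustrative.

def prox_abs(x: float, beta: float) -> float:
    """Minimizer of |y| + (y - x)^2 / (2*beta): soft-thresholding at beta."""
    shrunk = max(abs(x) - beta, 0.0)
    return shrunk if x >= 0 else -shrunk

def moreau_env_abs(x: float, beta: float) -> float:
    """Envelope value, obtained by plugging the proximal point into the objective."""
    y = prox_abs(x, beta)
    return abs(y) + (y - x) ** 2 / (2.0 * beta)

def moreau_grad_abs(x: float, beta: float) -> float:
    """Gradient of the envelope via the identity (x - prox(x)) / beta."""
    return (x - prox_abs(x, beta)) / beta

def huber(x: float, beta: float) -> float:
    """Closed form of the envelope of |x|: the Huber function."""
    if abs(x) <= beta:
        return x * x / (2.0 * beta)
    return abs(x) - beta / 2.0

if __name__ == "__main__":
    beta = 0.5
    for x in (-2.0, -0.25, 0.0, 0.25, 2.0):
        print(x, moreau_env_abs(x, beta), moreau_grad_abs(x, beta))
```

In this example the envelope is smooth with a $(1/\beta)$-Lipschitz gradient and satisfies $0 \le |x| - f^{\beta}(x) \le \beta/2$, so a smaller $\beta$ gives a tighter approximation at the cost of a larger gradient Lipschitz constant. This is the kind of trade-off a smoothing framework has to balance.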

Paper Structure

This paper contains 58 sections, 30 theorems, 181 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

Lemma 2.1

The Moreau envelope $f^{\beta}$ defined in eq:moreau-envelope is differentiable with gradient

Figures (5)

  • Figure 1: Deterministic problems. First row: $h(z) = z^2$; Second row: $h(z) = e^z + 10$. Within each row from left to right: $(\kappa, p) \in \{(1, 0), (10, 0), (1, 0.2), (10, 0.2)\}$. x-axis: iteration number. y-axis: $f(\mathbf{x}^k)$.
  • Figure 2: Experiments comparing the range of optimal stepsize for different smoothing approaches. x-axis: $\alpha_0$. y-axis: number of iterations to reach the stopping criterion.
  • Figure 3: Stochastic problems. First row: $h(z) = z^2$; Second row: $h(z) = z^5 + z^3 + 1$. Within each row from left to right: $(\kappa, p) \in \{(1, 0), (10, 0), (1, 0.2), (10, 0.2)\}$. x-axis: iteration number. y-axis: $f(\mathbf{x}^k)$.
  • Figure 4: Experiments comparing robustness of different smoothing approaches. x-axis: $\alpha_0$. y-axis: number of iterations to reach the stopping criterion.
  • Figure 5: Experiments on the comparison between AGLS (AGLS-SIPP) and NSGM for generalized smooth problems. The first row corresponds to $\sigma = 0$ and the second row corresponds to $\sigma = 1$. From left to right: $(m, d) \in \{(5, 20), (5, 100), (10, 20), (10, 100)\}$. x-axis: iteration number. y-axis: $f(\mathbf{x}^k)$ (or $f(\mathbf{x}^k) - f^\star$ for convex problems).

Theorems & Definitions (68)

  • Definition 2.1: Approximate stationarity
  • Lemma 2.1
  • Definition 2.2: Generalized smoothness
  • Remark 2.1
  • Lemma 2.2
  • Proposition 2.1
  • Definition 3.1
  • Definition 3.2: Approximate subgradient
  • Proposition 3.1
  • proof
  • ...and 58 more