Revisiting Subgradient Method: Complexity and Convergence Beyond Lipschitz Continuity

Xiao Li; Lei Zhao; Daoli Zhu; Anthony Man-Cho So

TL;DR

This work extends the typical iteration complexity results for the subgradient method to cover non-Lipschitz convex and weakly convex minimization and provides convergence results for the subgradient method in the non-Lipschitz setting when proper diminishing rules on the step size are used.

Abstract

The subgradient method is one of the most fundamental algorithmic schemes for nonsmooth optimization. The existing complexity and convergence results for this method are mainly derived for Lipschitz continuous objective functions. In this work, we first extend the typical iteration complexity results for the subgradient method to cover non-Lipschitz convex and weakly convex minimization. Specifically, for the convex case, we can drive the suboptimality gap to below $\varepsilon$ in $\mathcal{O}( \varepsilon^{-2} )$ iterations; for the weakly convex case, we can drive the gradient norm of the Moreau envelope of the objective function to below $\varepsilon$ in $\mathcal{O}( \varepsilon^{-4} )$ iterations. Then, we provide convergence results for the subgradient method in the non-Lipschitz setting when proper diminishing rules on the step size are used. In particular, when $f$ is convex, we establish an $\mathcal{O}(\log(k)/\sqrt{k})$ rate of convergence in terms of the suboptimality gap, where $k$ represents the iteration count. With an additional quadratic growth property, the rate is improved to $\mathcal{O}(1/k)$ in terms of the squared distance to the optimal solution set. When $f$ is weakly convex, asymptotic convergence is established. Our results neither require any modification to the subgradient method nor impose any growth condition on the subgradients, while our analysis is surprisingly simple. To further illustrate the wide applicability of our framework, we extend the aforementioned iteration complexity results to cover the truncated subgradient, the stochastic subgradient, and the proximal subgradient methods for non-Lipschitz convex / weakly convex objective functions.
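To make the non-Lipschitz setting concrete, here is a minimal sketch of the plain subgradient method with a diminishing step size. The test function $f(x) = \|x\|_1 + \tfrac{1}{2}\|x\|_2^2$, the step rule $\alpha_k = c/\sqrt{k+1}$, the constant `c`, and all function names are illustrative assumptions, not taken from the paper. The function is convex with minimizer $x^* = 0$, but its subgradients grow linearly with $\|x\|$, so it is not globally Lipschitz continuous.

```python
import numpy as np


def f(x):
    # Convex, but not globally Lipschitz: the quadratic term makes
    # the subgradient norm grow linearly with ||x||.
    return np.sum(np.abs(x)) + 0.5 * np.dot(x, x)


def subgrad(x):
    # One valid subgradient of f at x. np.sign returns 0 at 0,
    # which lies in the subdifferential [-1, 1] of |.| at 0.
    return np.sign(x) + x


def subgradient_method(x0, num_iters, c=0.5):
    # Plain subgradient method with the assumed diminishing
    # step size alpha_k = c / sqrt(k + 1).
    x = np.asarray(x0, dtype=float)
    f_best = f(x)
    for k in range(num_iters):
        alpha = c / np.sqrt(k + 1)
        x = x - alpha * subgrad(x)
        f_best = min(f_best, f(x))  # track the best objective value seen
    return x, f_best


x_last, f_best = subgradient_method([5.0, -3.0, 2.0], num_iters=2000)
print(f_best, np.linalg.norm(x_last))
```

Since the optimal value here is 0, `f_best` is itself the best suboptimality gap observed along the run.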

Paper Structure (19 sections, 12 theorems, 84 equations)

Introduction
The context and our goal.
New Iteration Complexity Bounds for the Subgradient Method
Convergence results using diminishing step size rules.
Our idea and extensions.
Prior Arts
Basic Notions in Nonsmooth Optimization
Proofs of Complexity Results
Proof of Theorem 1
Proof of Theorem 2
Convergence Results with Diminishing Step Size Rules
Convergence Results for Convex Case
Recovery of Shor's convergence result.
Convergence Result for Weakly Convex Case
Applications to Some Variants of the Subgradient Method
...and 4 more sections

Key Result

Theorem 1

Suppose that $f$ in the minimization problem is convex and the step sizes $\{\alpha_k\}_{k=0}^T$ satisfy …, where $c>0$ is some constant and $T$ is the pre-determined total number of iterations. Let $\tilde{x}^T=\frac{1}{T+1}\sum_{k=0}^T x^k$ be the averaged iterate and $x^*\in \mathcal{X}^*$ be some fixed optimal solution to the problem. Then, the trajectory $\{x^k\}_{k=0}^T$ lies in the ball …
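The averaged iterate $\tilde{x}^T$ from the theorem can be illustrated with a short sketch. The step-size condition is elided in the statement above, so the constant step $\alpha_k = c/\sqrt{T+1}$ used here is only an assumption, as are the test function $f(x) = \|x\|_1 + \tfrac{1}{2}\|x\|_2^2$ (convex, non-Lipschitz, with $x^* = 0$) and every name in the code. The sketch also records the largest distance from any iterate to $x^*$, which gives an empirical feel for the claim that the trajectory stays in a ball.

```python
import numpy as np


def f(x):
    # Convex, non-Lipschitz test objective (an assumption for illustration).
    return np.sum(np.abs(x)) + 0.5 * np.dot(x, x)


def subgrad(x):
    # One valid subgradient of f at x.
    return np.sign(x) + x


def averaged_subgradient(x0, T, c=1.0):
    # Runs T + 1 subgradient steps and averages x^0, ..., x^T.
    x = np.asarray(x0, dtype=float)
    alpha = c / np.sqrt(T + 1)  # assumed constant step size, fixed by T
    running_sum = np.zeros_like(x)
    max_dist = 0.0
    for _ in range(T + 1):
        running_sum += x  # accumulate the current iterate x^k
        max_dist = max(max_dist, np.linalg.norm(x))  # distance to x* = 0
        x = x - alpha * subgrad(x)
    return running_sum / (T + 1), max_dist


x0 = [5.0, -3.0, 2.0]
x_avg, max_dist = averaged_subgradient(x0, T=9999)
print(f(x_avg), max_dist)
```

In this run the iterates never move farther from $x^*$ than the starting point does, even though no Lipschitz bound on $f$ is available.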

Theorems & Definitions (23)

Theorem 1: complexity of $\mathsf{SubGrad}$ for convex case
Theorem 2: complexity of $\mathsf{SubGrad}$ for weakly convex case
Lemma 1: basic recursion for $\mathsf{SubGrad}$
proof
Lemma 2
proof
proof: Proof of Theorem 2
Corollary 1: convergence rate of $\mathsf{SubGrad}$ for convex case
proof
Corollary 2: convergence rate of $\mathsf{SubGrad}$ for convex case with quadratic growth property
...and 13 more
