An Efficient Stochastic First-Order Algorithm for Nonconvex-Strongly Concave Minimax Optimization beyond Lipschitz Smoothness

Yan Gao; Yongchao Liu

TL;DR

This paper studies stochastic minimax problems under a generalized smoothness condition and proposes an algorithm, NSGDA-M, which simultaneously updates the inner variable by stochastic gradient ascent and updates the outer variable by normalized stochastic gradient descent with momentum.
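The update scheme described above can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the momentum form `m = beta * m + (1 - beta) * gx`, the step sizes, the helper names `nsgda_m` and `toy_stoch_grad`, and the toy objective $f(x, y) = \langle x, y\rangle - \tfrac{1}{2}\|y\|^2$ (strongly concave in $y$, with primal function $\tfrac{1}{2}\|x\|^2$) are all assumptions chosen for illustration.

```python
import numpy as np


def nsgda_m(stoch_grad, x0, y0, eta_x=0.01, eta_y=0.1, beta=0.9, steps=2000):
    """Sketch of simultaneous updates: SGA on y, normalized SGD with momentum on x.

    The momentum form and the default hyperparameters are assumptions made
    for this illustration; they are not taken from the paper.
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(steps):
        # Both stochastic gradients are evaluated at the current (x, y),
        # so the two variables are updated simultaneously.
        gx, gy = stoch_grad(x, y)
        m = beta * m + (1.0 - beta) * gx  # momentum on the outer gradient
        x_next = x - eta_x * m / (np.linalg.norm(m) + 1e-12)  # normalized descent
        y_next = y + eta_y * gy  # plain stochastic gradient ascent
        x, y = x_next, y_next
    return x, y


rng = np.random.default_rng(0)


def toy_stoch_grad(x, y, sigma=0.1):
    """Noisy gradients of the hypothetical toy objective <x, y> - 0.5 * ||y||^2."""
    gx = y + sigma * rng.standard_normal(x.shape)
    gy = (x - y) + sigma * rng.standard_normal(y.shape)
    return gx, gy


x0 = np.array([3.0, -2.0])
x_final, y_final = nsgda_m(toy_stoch_grad, x0, np.zeros(2))
```

Because the outer step is normalized, each iteration moves $x$ by a fixed length `eta_x` regardless of the gradient magnitude. In this sketch the iterates therefore settle into a small neighbourhood of the stationary point rather than converging exactly.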

Abstract

In recent years, nonconvex minimax problems have attracted significant attention due to their broad applications in machine learning, including generative adversarial networks, robust optimization and adversarial training. Most existing algorithms for nonconvex stochastic minimax problems are developed under the standard Lipschitz smoothness assumption. In this paper, we study stochastic minimax problems under a generalized smoothness condition and propose an algorithm, NSGDA-M, which simultaneously updates the inner variable by stochastic gradient ascent and updates the outer variable by normalized stochastic gradient descent with momentum. When the objective function is nonconvex-strongly concave, we show that NSGDA-M finds an $\epsilon$-stationary point of the primal function within $\mathcal{O}(\epsilon^{-4})$ stochastic gradient evaluations in expectation, and $\mathcal{O}\left(\epsilon^{-4}\left(\log\frac{1}{\delta}\right)^{3/2}\right)$ stochastic gradient evaluations in high probability, where $\delta\in (0,1)$ is the failure probability. We verify the effectiveness of the proposed algorithm through numerical experiments on a distributionally robust optimization problem.
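For orientation, the terms used in the abstract can be written out as follows. This is a sketch using the standard definitions for nonconvex-strongly concave minimax problems; the symbols $F$, $\xi$, and $\hat{x}$ are notation assumed here and may differ from the paper's.

```latex
% Stochastic minimax problem and its primal function (assumed notation)
\min_{x} \max_{y} \; f(x, y) := \mathbb{E}_{\xi}\left[ F(x, y; \xi) \right],
\qquad
\Phi(x) := \max_{y} f(x, y).

% A point \hat{x} is an \epsilon-stationary point of the primal function if
\left\| \nabla \Phi(\hat{x}) \right\| \le \epsilon .

% Complexity claims stated in the abstract (stochastic gradient evaluations):
% in expectation:        \mathcal{O}(\epsilon^{-4})
% with prob. 1 - \delta: \mathcal{O}\left( \epsilon^{-4} \left( \log\tfrac{1}{\delta} \right)^{3/2} \right)
```

Strong concavity in $y$ makes the inner maximizer unique, which is what allows $\nabla \Phi$ to be used as the stationarity measure.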

Paper Structure

This paper contains 6 sections, 9 theorems, 146 equations, 1 figure, 1 table, 2 algorithms.

Introduction
Preliminaries and Algorithm
Convergence Analysis
Convergence analysis of NSGDA-M in expectation
Convergence analysis of NSGDA-M in high probability
Numerical Experiments

Key Result

Lemma 1

Suppose Assumptions RS_ass_PhiLowerBound--RS_ass_gradyStarBound hold. For any $x$ and $x'$ that satisfy $\|x' - x\| \leq \frac{1}{L_{x,1}}$, we have [displayed inequality missing from the source], where $\kappa := \frac{L_{y,0} + L_{y,1}B}{\mu}$.

Figures (1)

Figure 1: Convergence of NSGDA-M, NSGDA and SGDA

Theorems & Definitions (16)

Lemma 1
proof
Lemma 2
proof
Lemma 3
proof
Theorem 1
proof
Lemma 4
Lemma 5
...and 6 more
