First-order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems

Mingrui Liu, Hassan Rafique, Qihang Lin, Tianbao Yang

TL;DR

This work develops the first non-asymptotic, first-order convergence theory for weakly-convex-weakly-concave min-max problems by recasting them as variational inequalities. It introduces an inexact proximal point framework that solves a sequence of strongly monotone SVIs to obtain a nearly stationary solution, with an iteration complexity that depends on the subroutine used (SGD, GD, EG, VR). The theoretical results are complemented by experiments on synthetic problems and on GAN training (WGAN/WGAN-GP) that demonstrate practical convergence and performance benefits. The framework unifies nonconvex-nonconcave min-max analysis under VI theory and extends to weakly monotone SVIs.

Abstract

In this paper, we consider first-order convergence theory and algorithms for solving a class of non-convex non-concave min-max saddle-point problems whose objective function is weakly convex in the minimization variables and weakly concave in the maximization variables. This class has many important applications in machine learning, including training Generative Adversarial Nets (GANs). We propose an algorithmic framework motivated by the inexact proximal point method, in which the weakly monotone variational inequality (VI) corresponding to the original min-max problem is solved by approximately solving a sequence of strongly monotone VIs, each constructed by adding a strongly monotone mapping to the original gradient mapping. We prove that the generic framework converges, in the first-order sense, to a nearly stationary solution of the original min-max problem, and we establish different rates depending on the algorithm used to solve each strongly monotone VI. Experiments verify the convergence theory and demonstrate the effectiveness of the proposed methods for training GANs.
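The proximal-point idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's algorithm or notation: the toy objective $f(x,y) = -\tfrac{a}{2}x^2 + bxy + \tfrac{a}{2}y^2$ (which is $a$-weakly-convex-weakly-concave), the unconstrained domain (so no projection step), the regularizer $\gamma({\mathbf z} - {\mathbf z}_k)$ with $\gamma = 2a$, the plain gradient-descent inner solver, and the helper names `F`, `gd_subroutine`, and `inexact_proximal_point`.

```python
import math

A_COEF, B_COEF = 0.1, 1.0  # hypothetical toy constants a and b


def F(z, a=A_COEF, b=B_COEF):
    """Gradient mapping (df/dx, -df/dy) of the toy objective
    f(x, y) = -a/2 * x^2 + b*x*y + a/2 * y^2 (illustrative only)."""
    x, y = z
    return (-a * x + b * y, -b * x - a * y)


def gd_subroutine(G, z0, eta, T):
    """Run T plain gradient steps z <- z - eta * G(z) (no projection)."""
    z = z0
    for _ in range(T):
        g = G(z)
        z = (z[0] - eta * g[0], z[1] - eta * g[1])
    return z


def inexact_proximal_point(F, z0, gamma, eta, T, K):
    """Approximately solve K regularized subproblems in sequence.

    Subproblem k uses the mapping F_k(w) = F(w) + gamma * (w - z_k), which is
    strongly monotone whenever gamma exceeds the weak-monotonicity constant.
    """
    z = z0
    for _ in range(K):
        def F_k(w, c=z):
            g = F(w)
            return (g[0] + gamma * (w[0] - c[0]), g[1] + gamma * (w[1] - c[1]))
        z = gd_subroutine(F_k, z, eta, T)  # warm-start at the current center
    return z


gamma = 2 * A_COEF                 # regularization strength (assumed choice)
mu = gamma - A_COEF                # strong-monotonicity modulus of each F_k
L_sq = mu ** 2 + B_COEF ** 2       # squared Lipschitz constant of each F_k
z_final = inexact_proximal_point(F, (1.0, 1.0), gamma, mu / L_sq, T=1000, K=30)
print("distance to the stationary point (0, 0):", math.hypot(*z_final))
```

On this toy problem the only stationary point is the origin. Plain gradient descent on `F` itself moves away from it for any positive step size, because the linear map has a negative-definite symmetric part, while each regularized subproblem is strongly monotone and can be solved by the inner loop.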

Paper Structure

This paper contains 25 sections, 19 theorems, 97 equations, 2 figures, 3 tables, 6 algorithms.

Key Result

Lemma 3

$f({\mathbf x},{\mathbf y})$ is $\rho$-weakly-convex-weakly-concave if and only if $F({\mathbf z})$ is $\rho$-weakly monotone.
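For readers without the full text at hand, the objects in Lemma 3 can be spelled out as follows. These are the standard definitions, stated here as an assumption about the paper's notation rather than quoted from it: ${\mathbf z} = ({\mathbf x},{\mathbf y})$, $F$ is the (sub)gradient mapping of the min-max problem, and weak monotonicity relaxes monotonicity by a quadratic term.

$$
F({\mathbf z}) := \big(\partial_{{\mathbf x}} f({\mathbf x},{\mathbf y}),\ \partial_{{\mathbf y}}[-f({\mathbf x},{\mathbf y})]\big),
\qquad
\langle \xi - \xi',\ {\mathbf z} - {\mathbf z}' \rangle \ \ge\ -\rho\, \|{\mathbf z} - {\mathbf z}'\|^2
\quad \text{for all } \xi \in F({\mathbf z}),\ \xi' \in F({\mathbf z}').
$$

As a hypothetical one-dimensional check, take $f(x,y) = -\tfrac{a}{2}x^2 + bxy + \tfrac{a}{2}y^2$ with $a > 0$. Then $F({\mathbf z}) = (-ax + by,\ -bx - ay)$, the bilinear terms cancel in the inner product, and

$$
\langle F({\mathbf z}) - F({\mathbf z}'),\ {\mathbf z} - {\mathbf z}' \rangle = -a\, \|{\mathbf z} - {\mathbf z}'\|^2 ,
$$

so $F$ is $a$-weakly monotone but not monotone, matching the fact that $f + \tfrac{a}{2}x^2$ is convex in $x$ and $f - \tfrac{a}{2}y^2$ is concave in $y$.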

Figures (2)

  • Figure 1: Comparison of different methods for solving WGAN and WGAN-GP.
  • Figure : GD$(F,{\mathcal{Z}},{\mathbf z}^{(0)},\eta,T)$

Theorems & Definitions (23)

  • Definition 1
  • Definition 2
  • Lemma 3
  • Lemma 4
  • Definition 5
  • Lemma 6
  • Theorem 7
  • Proposition 8
  • Corollary 9
  • Lemma 10
  • ...and 13 more