An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization

Lesi Chen; Haishan Ye; Luo Luo

TL;DR

This work tackles decentralized stochastic minimax optimization with nonconvexity in $x$ and strong concavity in $y$ across a network of $m$ agents. It introduces DREAM, a stochastic recursive-gradient method with gradient tracking and a novel Lyapunov function that unifies online and offline analyses and accommodates constrained $\mathcal{Y}$. The main results establish optimal or near-optimal computation and communication complexities: $O\bigl(\kappa^2\sigma^2\epsilon^{-2} + \kappa^3 L \sigma \epsilon^{-3}\bigr)$ SFOs online and $O(mn + \sqrt{mn}\kappa^2 L\epsilon^{-2})$ SFOs offline, with communication $O\left(\dfrac{\kappa^2 L \epsilon^{-2}\log m}{\sqrt{\delta}}\right)$. DREAM also achieves a linear speed-up with the number of agents and outperforms prior decentralized minimax methods both theoretically and empirically on robust logistic regression tasks.
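To illustrate the gradient-tracking ingredient named above, the following minimal sketch (not the DREAM update itself) runs the generic tracking recursion $s^{t+1} = W(s^{t} + g^{t+1} - g^{t})$ on three agents holding scalar values. The mixing matrix `W`, the helper names, and all numbers are assumptions chosen for illustration. The only point is that, with a doubly stochastic `W` and $s^{0} = g^{0}$, the network average of the trackers always equals the average of the current local gradient estimates.

```python
def mix(W, v):
    """One gossip round: agent i forms a weighted average of all values v[j]."""
    m = len(v)
    return [sum(W[i][j] * v[j] for j in range(m)) for i in range(m)]


def track(W, s, g_new, g_old):
    """Generic gradient-tracking update: s <- W (s + g_new - g_old)."""
    return mix(W, [s[i] + g_new[i] - g_old[i] for i in range(len(s))])


def mean(v):
    return sum(v) / len(v)


def run_tracking(W, estimates):
    """Run tracking over a sequence of local estimates; returns the final trackers."""
    s = list(estimates[0])  # initialise trackers with the first local estimates
    for g_old, g_new in zip(estimates, estimates[1:]):
        s = track(W, s, g_new, g_old)
    return s


# Hypothetical 3-agent network with a symmetric, doubly stochastic mixing matrix.
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

# Hypothetical local gradient estimates at three consecutive iterations.
estimates = [[1.0, -2.0, 4.0], [0.0, 3.0, 1.5], [2.0, 2.0, -1.0]]

trackers = run_tracking(W, estimates)
```

Here `mean(trackers)` matches `mean(estimates[-1])` up to floating-point error, even though the individual trackers differ from the individual local estimates.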

Abstract

This paper studies stochastic nonconvex-strongly-concave minimax optimization over a multi-agent network. We propose an efficient algorithm, called Decentralized Recursive gradient descEnt Ascent Method (DREAM), which achieves the best-known theoretical guarantee for finding $\epsilon$-stationary points. Concretely, it requires $\mathcal{O}(\min(\kappa^3\epsilon^{-3}, \kappa^2\sqrt{N}\epsilon^{-2}))$ stochastic first-order oracle (SFO) calls and $\tilde{\mathcal{O}}(\kappa^2\epsilon^{-2})$ communication rounds, where $\kappa$ is the condition number and $N$ is the total number of individual functions. Our numerical experiments also validate the superiority of DREAM over previous methods.
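The two terms inside the $\min$ of the SFO bound can be compared directly: ignoring constants, the finite-sum term $\kappa^2\sqrt{N}\epsilon^{-2}$ is the smaller one exactly when $\sqrt{N} < \kappa\epsilon^{-1}$. The sketch below evaluates both orders; the function name and the sample values are illustrative assumptions, and the constants hidden by the $\mathcal{O}$-notation are dropped.

```python
import math


def sfo_order(kappa, eps, n_total):
    """Return (order, regime) for min(kappa^3 eps^-3, kappa^2 sqrt(N) eps^-2).

    Hidden constants are ignored, so only the comparison between
    the two terms is meaningful, not the absolute numbers.
    """
    online = kappa**3 / eps**3
    offline = kappa**2 * math.sqrt(n_total) / eps**2
    if offline < online:
        return offline, "offline"
    return online, "online"


# Hypothetical values: kappa = 10 and eps = 0.1 put the crossover at N = (kappa/eps)^2 = 10^4.
small_n = sfo_order(10, 0.1, 100)
large_n = sfo_order(10, 0.1, 10**6)
```

With these sample values, the finite-sum term is the smaller one for $N = 100$, while the $\kappa^3\epsilon^{-3}$ term takes over for $N = 10^6$.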

Paper Structure (17 sections, 14 theorems, 89 equations, 2 figures, 1 table, 1 algorithm)

Introduction
Notations.
Assumptions and Preliminaries
The Proposed Algorithm
Method Overview
A Novel Lyapunov Function
Convergence Analysis
Experiments
Conclusion and Future Work
Some Useful Lemmas
Discussions on DM-HSGD
The Proof of Lemma \ref{lem:Lyp-P}
The Proof of Lemma \ref{lem:Lyp-C}
The Proof of Lemma \ref{lem:Lyp-V}
The Proof of Lemma \ref{lem:Lyp-U}
...and 2 more sections

Key Result

Proposition 2.1

Under Assumptions \ref{asm:smooth} (smoothness) and \ref{asm:SC} (strong concavity), the function $P(x) \triangleq \max_{y \in \mathcal{Y}} f(x,y)$ is $L_P$-smooth with $L_P \triangleq (\kappa+1)L$, and its gradient can be written as $\nabla P(x) = \nabla_x f(x,y^*(x))$, where $y^*(x) \triangleq \arg\max_{y \in \mathcal{Y}} f(x,y)$.
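The gradient identity in Proposition 2.1 can be checked numerically on a toy problem. The sketch below assumes the scalar function $f(x,y) = \cos(x) + a\,x\,y - \tfrac{\mu}{2}y^2$ with $\mathcal{Y} = \mathbb{R}$, which is smooth, nonconvex in $x$, and $\mu$-strongly concave in $y$. This function, the constants `A` and `MU`, and the helper names are illustrative assumptions, not taken from the paper. For this choice $y^*(x) = a x / \mu$ in closed form, so $\nabla_x f(x, y^*(x))$ can be compared against a finite-difference derivative of $P$.

```python
import math

A = 1.0   # coupling strength between x and y (illustrative)
MU = 2.0  # strong-concavity modulus in y (illustrative)


def f(x, y):
    """Toy objective: nonconvex in x, MU-strongly concave in y."""
    return math.cos(x) + A * x * y - 0.5 * MU * y * y


def grad_x_f(x, y):
    """Partial derivative of f with respect to x."""
    return -math.sin(x) + A * y


def y_star(x):
    """Closed-form maximiser of f(x, .) over the real line."""
    return A * x / MU


def P(x):
    """Primal function P(x) = max_y f(x, y)."""
    return f(x, y_star(x))


def finite_diff(fn, x, h=1e-6):
    """Central finite-difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


def danskin_gap(x):
    """Absolute gap between a numerical P'(x) and grad_x f(x, y*(x))."""
    return abs(finite_diff(P, x) - grad_x_f(x, y_star(x)))
```

Evaluating `danskin_gap` at a few points (for example `0.7`, `-1.3`, `2.0`) gives values at the level of finite-difference error, consistent with $\nabla P(x) = \nabla_x f(x, y^*(x))$.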

Figures (2)

Figure 1: Comparison on the number of SFO calls against $P(\bar{x}) \triangleq \max_{y \in \Delta_N} f(\bar{x},y)$.
Figure 2: Comparison on the number of communication rounds against $P(\bar{x}) \triangleq \max_{y \in \Delta_N} f(\bar{x},y)$.

Theorems & Definitions (28)

Definition 2.1
Proposition 2.1
Definition 2.2
Definition 2.3
Proposition 2.2 (Koloskova et al., 2019)
Proposition 2.3 (Ye et al., 2020)
Definition 2.4
Definition 2.5
Remark 3.1
Lemma 3.1
...and 18 more
