
An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization

Lesi Chen, Haishan Ye, Luo Luo

TL;DR

This work tackles decentralized stochastic minimax optimization that is nonconvex in $x$ and strongly concave in $y$, over a network of $m$ agents. It introduces DREAM, a stochastic recursive-gradient method with gradient tracking, analyzed through a novel Lyapunov function that unifies the online and offline settings and accommodates a constrained domain $\mathcal{Y}$. The main results establish optimal or near-optimal computation and communication complexities: $O\bigl(\kappa^2\sigma^2\epsilon^{-2} + \kappa^3 L \sigma \epsilon^{-3}\bigr)$ stochastic first-order oracle (SFO) calls in the online setting and $O(mn + \sqrt{mn}\kappa^2 L\epsilon^{-2})$ SFO calls in the offline setting, with $O\left(\dfrac{\kappa^2 L \epsilon^{-2}\log m}{\sqrt{\delta}}\right)$ communication rounds. DREAM also achieves a linear speed-up in the number of agents and outperforms prior decentralized minimax methods both theoretically and empirically on robust logistic regression tasks.
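The gradient-tracking ingredient mentioned above can be illustrated on a much simpler problem. The following is a minimal sketch, not the DREAM algorithm: it solves a plain decentralized minimization (no $y$ variable, no recursive gradient estimator) and only shows how a tracker $s_i$ follows the network-average gradient. The ring topology, the quadratic local objectives $f_i(x) = \tfrac{1}{2}(x - b_i)^2$, the step size, and all function names are assumptions made for illustration.

```python
# Hypothetical illustration of gradient tracking on a ring of m agents.
# Local objectives f_i(x) = 0.5 * (x - b_i)^2 are an assumption of this sketch.

def ring_mixing_matrix(m):
    """Doubly stochastic W: each agent averages itself and its two ring neighbours."""
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in (i - 1, i, i + 1):
            W[i][j % m] += 1.0 / 3.0
    return W

def mix(W, v):
    """One communication round: every agent takes a weighted neighbour average."""
    m = len(v)
    return [sum(W[i][j] * v[j] for j in range(m)) for i in range(m)]

def local_grad(x_i, b_i):
    """Gradient of the assumed local objective 0.5 * (x - b_i)^2."""
    return x_i - b_i

def gradient_tracking(b, step=0.3, iters=200):
    m = len(b)
    W = ring_mixing_matrix(m)
    x = [0.0] * m
    g = [local_grad(x[i], b[i]) for i in range(m)]
    s = list(g)  # trackers are initialised with the local gradients
    max_gap = 0.0
    for _ in range(iters):
        # descend along the tracked direction, then average with neighbours
        x = mix(W, [x[i] - step * s[i] for i in range(m)])
        g_new = [local_grad(x[i], b[i]) for i in range(m)]
        # tracker update: mix, then add the change in the local gradient
        ws = mix(W, s)
        s = [ws[i] + g_new[i] - g[i] for i in range(m)]
        g = g_new
        # invariant: the average tracker equals the average local gradient
        max_gap = max(max_gap, abs(sum(s) / m - sum(g) / m))
    return x, max_gap

if __name__ == "__main__":
    x, gap = gradient_tracking([1.0, 2.0, 3.0, 4.0, 10.0])
    print(x, gap)
```

Because the mixing matrix is doubly stochastic, the average of the trackers stays equal to the average of the local gradients at every step, so each agent ends up descending along an estimate of the global gradient and the local iterates agree on the minimizer of the average objective.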

Abstract

This paper studies stochastic nonconvex-strongly-concave minimax optimization over a multi-agent network. We propose an efficient algorithm, called Decentralized Recursive gradient descEnt Ascent Method (DREAM), which achieves the best-known theoretical guarantee for finding $\epsilon$-stationary points. Concretely, it requires $\mathcal{O}(\min(\kappa^3\epsilon^{-3}, \kappa^2\sqrt{N}\epsilon^{-2}))$ stochastic first-order oracle (SFO) calls and $\tilde{\mathcal{O}}(\kappa^2\epsilon^{-2})$ communication rounds, where $\kappa$ is the condition number and $N$ is the total number of individual functions. Our numerical experiments also validate the superiority of DREAM over previous methods.
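To see which term inside the $\min$ of the SFO bound is the smaller one, the two terms can be compared directly. This is only a rough guide, since it ignores the constants hidden by $\mathcal{O}$:

$$
\kappa^3\epsilon^{-3} \le \kappa^2\sqrt{N}\,\epsilon^{-2}
\iff \kappa\,\epsilon^{-1} \le \sqrt{N}
\iff \epsilon \ge \frac{\kappa}{\sqrt{N}}.
$$

So the $\kappa^3\epsilon^{-3}$ term is the smaller one when $\epsilon \ge \kappa/\sqrt{N}$ (moderate accuracy or very large $N$), and the $\kappa^2\sqrt{N}\epsilon^{-2}$ term takes over once $\epsilon < \kappa/\sqrt{N}$.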

Paper Structure

This paper contains 17 sections, 14 theorems, 89 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Proposition 2.1

Under the smoothness and strong-concavity assumptions (asm:smooth and asm:SC), the function $P(x) \triangleq \max_{y \in \mathcal{Y}} f(x,y)$ is $L_P$-smooth with $L_P \triangleq (\kappa+1)L$, and its gradient can be written as $\nabla P(x) = \nabla_x f(x,y^*(x))$, where $y^*(x) \triangleq \arg\max_{y \in \mathcal{Y}} f(x,y)$.
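The gradient formula in Proposition 2.1 can be checked numerically on a toy instance. The sketch below assumes the hypothetical function $f(x,y) = \cos x + Bxy - \tfrac{\mu}{2}y^2$ with $\mathcal{Y} = \mathbb{R}$, $\mu = 1$ and $B = 2$, none of which come from the paper. For this choice $y^*(x) = Bx/\mu$ is available in closed form, and $L = 3$ is a valid smoothness constant (worked out by hand from the Hessian), giving $\kappa = 3$ and $L_P = 12$.

```python
import math

MU = 1.0  # strong-concavity modulus in y (assumed value)
B = 2.0   # coupling coefficient between x and y (assumed value)

def f(x, y):
    """Toy objective: nonconvex in x, MU-strongly concave in y."""
    return math.cos(x) + B * x * y - 0.5 * MU * y * y

def grad_x_f(x, y):
    """Partial derivative of f with respect to x."""
    return -math.sin(x) + B * y

def y_star(x):
    """Maximiser of f(x, .) over Y = R; closed form for the concave quadratic."""
    return B * x / MU

def P(x):
    """Primal function P(x) = max_y f(x, y)."""
    return f(x, y_star(x))

def finite_diff(fun, x, h=1e-6):
    """Central finite-difference approximation of fun'(x)."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)

def gradient_gap(x):
    """|P'(x) - grad_x f(x, y*(x))|, which should be close to zero."""
    return abs(finite_diff(P, x) - grad_x_f(x, y_star(x)))

def smoothness_holds(points, L=3.0):
    """Check |P'(a) - P'(b)| <= L_P |a - b| with L_P = (kappa + 1) L on sample points."""
    kappa = L / MU
    L_P = (kappa + 1.0) * L
    grads = [grad_x_f(p, y_star(p)) for p in points]
    n = len(points)
    return all(
        abs(grads[i] - grads[j]) <= L_P * abs(points[i] - points[j]) + 1e-12
        for i in range(n)
        for j in range(n)
    )

if __name__ == "__main__":
    pts = [-2.0, -0.5, 0.0, 1.0, 3.0]
    print([gradient_gap(p) for p in pts])
    print(smoothness_holds(pts))
```

The finite-difference derivative of $P$ agrees with $\nabla_x f(x, y^*(x))$ up to discretization error, which is the identity the proposition states. The sampled smoothness check is only a sanity test on a few points, not a proof of the $L_P$ bound.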

Figures (2)

  • Figure 1: $P(\bar{x}) \triangleq \max_{y \in \Delta_N} f(\bar{x},y)$ versus the number of SFO calls.
  • Figure 2: $P(\bar{x}) \triangleq \max_{y \in \Delta_N} f(\bar{x},y)$ versus the number of communication rounds.

Theorems & Definitions (28)

  • Definition 2.1
  • Proposition 2.1
  • Definition 2.2
  • Definition 2.3
  • Proposition 2.2: koloskova2019decentralized
  • Proposition 2.3: ye2020multi
  • Definition 2.4
  • Definition 2.5
  • Remark 3.1
  • Lemma 3.1
  • ...and 18 more