Shuffling Gradient Descent-Ascent with Variance Reduction for Nonconvex-Strongly Concave Smooth Minimax Problems
Xia Jiang, Linglingzhi Zhu, Anthony Man-Cho So, Shisheng Cui, Jian Sun
TL;DR
This paper proposes a novel single-loop stochastic gradient descent-ascent (GDA) algorithm that employs both shuffling schemes and variance reduction to solve nonconvex-strongly concave smooth minimax problems and achieves $\epsilon$-stationarity in expectation in $\mathcal{O}(\kappa^2 \epsilon^{-2})$ iterations.
Abstract
In recent years, there has been considerable interest in designing stochastic first-order algorithms to tackle finite-sum smooth minimax problems. To obtain the gradient estimates, one typically relies on the uniform sampling-with-replacement scheme or various sampling-without-replacement (also known as shuffling) schemes. While the former is easier to analyze, the latter often have better empirical performance. In this paper, we propose a novel single-loop stochastic gradient descent-ascent (GDA) algorithm that employs both shuffling schemes and variance reduction to solve nonconvex-strongly concave smooth minimax problems. We show that the proposed algorithm achieves $ε$-stationarity in expectation in $\mathcal{O}(κ^2 ε^{-2})$ iterations, where $κ$ is the condition number of the problem. This outperforms existing shuffling schemes and matches the complexity of the best-known sampling-with-replacement algorithms. Our proposed algorithm also achieves the same complexity as that of its deterministic counterpart, the two-timescale GDA algorithm. Our numerical experiments demonstrate the superior performance of the proposed algorithm.
