A Stochastic GDA Method With Backtracking For Solving Nonconvex (Strongly) Concave Minimax Problems
Qiushui Xu, Xuan Zhang, Necdet Serhat Aybat, Mert Gürbüzbalaban
TL;DR
To our knowledge, SGDA-B is the first GDA-type method with backtracking for solving nonconvex-(strongly) concave (NCC) minimax problems, and it achieves the best complexity among methods that are agnostic to the Lipschitz constant $L$.
Abstract
We propose a stochastic GDA (gradient descent ascent) method with backtracking (SGDA-B) to solve nonconvex-(strongly) concave (NCC) minimax problems $\min_x \max_y \sum_{i=1}^N g_i(x_i)+f(x,y)-h(y)$, where $h$ and $g_i$ for $i = 1, \ldots, N$ are closed, convex functions, and $f$ is $L$-smooth and $μ$-strongly concave in $y$ for some $μ\geq 0$. We consider two scenarios: (i) the deterministic setting, where we assume one can compute $\nabla f$ exactly, and (ii) the stochastic setting, where $\nabla f$ is accessible only through an unbiased stochastic oracle with finite variance. While most existing methods assume knowledge of the Lipschitz constant $L$, SGDA-B is agnostic to $L$. Moreover, SGDA-B can support random block-coordinate updates. In the deterministic setting, SGDA-B can compute an $ε$-stationary point within $\mathcal{O}(Lκ^2/ε^2)$ and $\mathcal{O}(L^3/ε^4)$ gradient calls when $μ>0$ and $μ=0$, respectively, where $κ=L/μ$. In the stochastic setting, for any $p \in (0, 1)$ and $ε>0$, it can compute an $ε$-stationary point with probability at least $1-p$ using $\mathcal{O}(Lκ^3ε^{-4}\log(1/p))$ and $\tilde{\mathcal{O}}(L^4ε^{-7}\log(1/p))$ stochastic oracle calls when $μ>0$ and $μ=0$, respectively. To our knowledge, SGDA-B is the first GDA-type method with backtracking to solve NCC minimax problems, and it achieves the best complexity among the methods that are agnostic to $L$. We also provide numerical results for SGDA-B on a distributionally robust learning problem, illustrating the potential performance gains that SGDA-B can achieve.
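To make the idea of being agnostic to $L$ concrete, the following is a minimal sketch of a deterministic GDA loop with a guess-and-check backtracking scheme. It is not the SGDA-B algorithm from the paper. It assumes $g_i = h = 0$, uses a hypothetical toy objective $f(x,y) = -\cos(x) + xy - \tfrac{1}{2}y^2$ (nonconvex in $x$, strongly concave in $y$ with $μ=1$), and the names `grad_f`, `gda_with_backtracking`, `ratio`, `shrink`, and `budget` are illustrative assumptions. Each round runs GDA with step sizes derived from the current estimate of $L$. If no $ε$-stationary point is found within the iteration budget, the estimate is increased (so the step sizes shrink) and GDA restarts.

```python
import math


def grad_f(x, y):
    """Gradient of the toy objective f(x, y) = -cos(x) + x*y - 0.5*y**2."""
    return math.sin(x) + y, x - y


def gda_with_backtracking(x0, y0, eps=1e-4, L_est=0.01, shrink=0.5,
                          ratio=4.0, budget=2000, max_rounds=40):
    """Simplified guess-and-check backtracking around plain GDA.

    L_est is a (possibly far too small) initial guess of the smoothness
    constant; it is never compared against the true value.
    """
    for _ in range(max_rounds):
        sigma = 1.0 / L_est   # ascent step size for y
        tau = sigma / ratio   # smaller descent step size for x (assumed ratio)
        x, y = x0, y0         # restart from the initial point each round
        for _ in range(budget):
            gx, gy = grad_f(x, y)
            if math.hypot(gx, gy) <= eps:
                return x, y, L_est          # eps-stationary point found
            x, y = x - tau * gx, y + sigma * gy
            if not (math.isfinite(x) and math.isfinite(y)):
                break                       # iterates blew up: steps too large
        L_est /= shrink                     # backtrack: larger L_est, smaller steps
    raise RuntimeError("no eps-stationary point found within max_rounds")


x, y, L_used = gda_with_backtracking(1.0, 1.0)
print(f"x={x:.2e}, y={y:.2e}, accepted L estimate={L_used:.3g}")
```

For this toy problem the only stationary point is $(0, 0)$, so the returned iterate should lie close to the origin. The stopping test here uses the plain gradient norm, which is appropriate only because the nonsmooth terms are taken to be zero in this sketch.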
