Efficient Single-Loop Stochastic Algorithms for Nonconvex-Concave Minimax Optimization

Xia Jiang, Linglingzhi Zhu, Taoli Zheng, Anthony Man-Cho So

TL;DR

This work tackles nonconvex-concave finite-sum minimax problems by introducing two single-loop variance-reduced stochastic gradient methods. The PVR-SGDA algorithm uses a probabilistic full-gradient update together with Moreau-Yosida smoothing to achieve $\mathcal{O}(\epsilon^{-4})$ iteration complexity, improving over existing stochastic rates. To eliminate full gradient computations, the ZeroSARAH-SGDA variant employs auxiliary gradient trackers and retains a comparable $\mathcal{O}(\epsilon^{-4})$ iteration complexity with $\mathcal{O}(\sqrt{n}\epsilon^{-4})$ gradient calls, at the cost of extra memory. Numerical results on robust logistic regression and data poisoning demonstrate faster convergence and favorable gradient-efficiency trade-offs, validating the practical effectiveness of the proposed methods for large-scale NC-C minimax problems.
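To make the "auxiliary gradient trackers" and the associated memory cost concrete, here is a minimal sketch of a ZeroSARAH-style gradient estimator on a plain finite-sum least-squares objective. It is an illustration under stated assumptions, not the paper's algorithm: the toy objective, the mixing weight `lam`, and the helper names `make_least_squares_grad` and `zerosarah_estimate` are hypothetical, and how the paper adapts the estimator to the minimax setting is not shown in this summary.

```python
import numpy as np

def make_least_squares_grad(A, b):
    """Return grad_fn(x, idx) giving per-sample gradients of
    f_i(x) = 0.5 * (a_i . x - b_i)^2 for the rows in idx (toy objective, assumed)."""
    def grad_fn(x, idx):
        residual = A[idx] @ x - b[idx]          # shape (len(idx),)
        return residual[:, None] * A[idx]       # shape (len(idx), d)
    return grad_fn

def zerosarah_estimate(grad_fn, x_new, x_old, v_prev, trackers, idx, lam):
    """One ZeroSARAH-style estimator update using only the sampled indices idx.

    trackers is an (n, d) array holding one stored gradient per sample;
    this array is the extra memory the method trades for avoiding full gradients.
    """
    g_new = grad_fn(x_new, idx)
    g_old = grad_fn(x_old, idx)
    # Recursive (SARAH-like) difference term on the minibatch.
    diff = (g_new - g_old).mean(axis=0)
    # Tracker-based correction that stands in for a full-gradient refresh.
    correction = (g_old - trackers[idx]).mean(axis=0) + trackers.mean(axis=0)
    v = diff + (1.0 - lam) * v_prev + lam * correction
    # Refresh the trackers of the sampled indices only.
    new_trackers = trackers.copy()
    new_trackers[idx] = g_new
    return v, new_trackers

# Small demo: no full gradient is ever computed inside the loop.
rng = np.random.default_rng(0)
n, d = 50, 4
A, b = rng.standard_normal((n, d)), rng.standard_normal(n)
grad_fn = make_least_squares_grad(A, b)
x_old = np.zeros(d)
v = np.zeros(d)
trackers = np.zeros((n, d))
for _ in range(100):
    x_new = x_old - 0.05 * v
    idx = rng.integers(0, n, size=8)
    v, trackers = zerosarah_estimate(grad_fn, x_new, x_old, v, trackers, idx, lam=0.3)
    x_old = x_new
```

Each step touches only the sampled indices, while `trackers` keeps `n` stored gradients; that storage is the memory cost mentioned above.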

Abstract

Nonconvex-concave (NC-C) finite-sum minimax problems have wide applications in signal processing and machine learning. Conventional stochastic gradient algorithms, which rely on uniform sampling for gradient estimation, often suffer from slow convergence rates and require bounded-variance assumptions. While variance reduction techniques can significantly improve the convergence of stochastic algorithms, the inherently nonsmooth nature of NC-C problems makes it challenging to design effective variance reduction techniques. To address this challenge, we develop a novel probabilistic variance reduction scheme and propose a single-loop stochastic gradient algorithm called the probabilistic variance-reduced smoothed gradient descent-ascent (PVR-SGDA) algorithm. The proposed PVR-SGDA algorithm achieves an iteration complexity of $\mathcal{O}(\epsilon^{-4})$, surpassing the best-known rates of stochastic algorithms for NC-C minimax problems and matching the performance of state-of-the-art deterministic algorithms. Furthermore, to completely eliminate the need for full gradient computation and reduce the gradient complexity, we explore another variance reduction technique with auxiliary gradient trackers and propose a smoothed gradient descent-ascent algorithm without full gradient calculation, called ZeroSARAH-SGDA, for NC-C problems. The ZeroSARAH-SGDA algorithm achieves an iteration complexity comparable to that of PVR-SGDA while reducing the number of gradient oracle calls per iteration. Finally, we demonstrate the effectiveness of the two proposed algorithms through numerical simulations.
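The combination of a probabilistic full-gradient refresh with a smoothed descent-ascent step can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation. The toy components $f_i(x,y) = \cos(a_i^\top x) + y^\top (b_i \odot x)$, the box constraint $y \in [-1,1]^d$, the smoothing term $\frac{r}{2}\|x-z\|^2$, the update order, the function names, and all parameter values (`p`, `batch`, `r`, `c`, `alpha`, `beta`) are assumptions chosen for illustration and are not tuned according to the paper's theory.

```python
import numpy as np

def grad_batch(A, B, x, y, idx):
    """Averaged gradients over idx of the toy components
    f_i(x, y) = cos(a_i . x) + y . (b_i * x), linear (hence concave) in y."""
    s = np.sin(A[idx] @ x)                                  # shape (len(idx),)
    b_mean = B[idx].mean(axis=0)
    gx = -(s[:, None] * A[idx]).mean(axis=0) + b_mean * y   # gradient in x
    gy = b_mean * x                                         # gradient in y
    return gx, gy

def pvr_sgda(A, B, p=0.2, batch=4, r=4.0, c=0.05, alpha=0.05, beta=0.1,
             iters=200, seed=0):
    """Single-loop smoothed GDA with a probabilistic variance-reduced estimator."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    full = np.arange(n)
    x = rng.standard_normal(d)
    y = np.zeros(d)
    z = x.copy()                                   # auxiliary smoothing variable
    gx, gy = grad_batch(A, B, x, y, full)          # start from a full gradient
    for _ in range(iters):
        # Descent step on the smoothed function f(x, y) + (r / 2) * ||x - z||^2.
        x_new = x - c * (gx + r * (x - z))
        # Projected ascent step onto the box [-1, 1]^d.
        y_new = np.clip(y + alpha * gy, -1.0, 1.0)
        # Move the smoothing centre towards the new primal iterate.
        z = z + beta * (x_new - z)
        if rng.random() < p:
            # With probability p: refresh the estimator with a full gradient.
            gx, gy = grad_batch(A, B, x_new, y_new, full)
        else:
            # Otherwise: recursive minibatch correction of the previous estimator.
            idx = rng.integers(0, n, size=batch)
            gx_n, gy_n = grad_batch(A, B, x_new, y_new, idx)
            gx_o, gy_o = grad_batch(A, B, x, y, idx)
            gx, gy = gx + gx_n - gx_o, gy + gy_n - gy_o
        x, y = x_new, y_new
    return x, y, z

# Small demo on synthetic data.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5)) / np.sqrt(5)
B = rng.uniform(-1.0, 1.0, size=(20, 5))
x, y, z = pvr_sgda(A, B)
```

Setting `p=1.0` recovers a deterministic smoothed descent-ascent loop in this sketch, while smaller `p` lowers the expected number of gradient evaluations per iteration in exchange for a noisier estimator.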

Paper Structure

This paper contains 20 sections, 16 theorems, 59 equations, 2 figures, 1 table, 2 algorithms.

Key Result

Lemma 3.1

The function $K(\cdot,z;y)$ is strongly convex with modulus $r-L$, and $\nabla_x K(\cdot,z;y)$ is Lipschitz continuous with constant $L+r$.
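
A short derivation shows where the two constants come from. It assumes the usual smoothed form $K(x,z;y) = f(x,y) + \frac{r}{2}\|x-z\|^2$ with $\nabla_x f(\cdot,y)$ being $L$-Lipschitz and $r > L$; this definition is not stated in this summary and is an assumption here. Under it, $\nabla_x K(x,z;y) = \nabla_x f(x,y) + r(x-z)$, so for any $x, x'$:

$$
\begin{aligned}
\langle \nabla_x K(x,z;y) - \nabla_x K(x',z;y),\, x - x' \rangle
&= \langle \nabla_x f(x,y) - \nabla_x f(x',y),\, x - x' \rangle + r\|x - x'\|^2 \\
&\ge -L\|x - x'\|^2 + r\|x - x'\|^2 = (r - L)\|x - x'\|^2,
\end{aligned}
$$

$$
\|\nabla_x K(x,z;y) - \nabla_x K(x',z;y)\|
\le \|\nabla_x f(x,y) - \nabla_x f(x',y)\| + r\|x - x'\|
\le (L + r)\|x - x'\|.
$$

The first inequality uses Cauchy-Schwarz together with the Lipschitz bound on $\nabla_x f(\cdot,y)$, which gives strong convexity with modulus $r-L$; the second is the triangle inequality, which gives the Lipschitz constant $L+r$.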

Figures (2)

  • Figure 1: (a) Convergence of PVR-SGDA algorithm with different $p$. (b) Performance for different algorithms.
  • Figure 2: Testing accuracy with respect to gradient oracle calls in data poisoning.

Theorems & Definitions (31)

  • Remark 3.1
  • Lemma 3.1
  • Lemma 4.1
  • Lemma 4.2
  • Proposition 4.1
  • Lemma 4.3 (cf. li2023nonsmooth)
  • Definition 4.1
  • Lemma 4.4
  • Theorem 4.1
  • proof
  • ...and 21 more