Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization

Ruijia Zhang, Mingxi Lei, Meng Ding, Zihang Xiang, Jinhui Xu, Di Wang

TL;DR

This work tackles differential privacy in finite-sum minimax optimization under the nonconvex-strongly-concave regime, a setting relevant to deep learning models like deep AUC maximization. It first analyzes a DP-SGDA baseline and demonstrates a gradient-norm utility bound of $\tilde{O}(\frac{d^{1/4}}{(n\epsilon)^{1/2}})$, then introduces PrivateDiff Minimax, a variance-reduction approach that leverages gradient differences and restart schemes to achieve the improved rate $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, matching the best-known DP-ERM results for non-convex loss. The paper provides lower bounds for private minimax and validates the theory with experiments on AUC maximization, GANs, and temporal-difference learning, showing that PrivateDiff Minimax consistently outperforms DP-SGDA under various privacy budgets. Overall, these results advance private training for nonconvex minimax problems and suggest practical pathways for privacy-preserving deep learning tasks.
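To see when the second rate actually improves on the first, the two bounds can be compared directly. The following is a worked ratio that ignores the logarithmic factors hidden in $\tilde{O}(\cdot)$; the numerical values at the end are illustrative assumptions, not figures from the paper.

$$
\frac{d^{1/3}/(n\epsilon)^{2/3}}{d^{1/4}/(n\epsilon)^{1/2}}
= \frac{d^{1/3-1/4}}{(n\epsilon)^{2/3-1/2}}
= \frac{d^{1/12}}{(n\epsilon)^{1/6}}
= \left(\frac{\sqrt{d}}{n\epsilon}\right)^{1/6}
$$

The ratio is below 1 exactly when $\sqrt{d} < n\epsilon$, so up to logarithmic factors the PrivateDiff Minimax rate is the smaller one in that regime. For example, with $d = 10^4$, $n = 10^4$ and $\epsilon = 1$, the ratio is $(10^{-2})^{1/6} = 10^{-1/3} \approx 0.46$.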

Abstract

In this paper, we study the problem of (finite-sum) minimax optimization in the Differential Privacy (DP) model. Unlike most previous studies, which consider (strongly) convex-concave settings or loss functions satisfying the Polyak-Łojasiewicz condition, we mainly focus on the nonconvex-strongly-concave setting, which encapsulates many models in deep learning such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it is possible to obtain a DP estimator at which the $l_2$-norm of the gradient of the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{(n\epsilon)^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with lower gradient noise variance and improve the upper bound to $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, which matches the best-known result for DP Empirical Risk Minimization with non-convex loss. We also discuss several lower bounds for private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.
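The abstract does not spell out the DP-SGDA update, so the following is only a minimal sketch of one common way to privatize SGDA: clip per-sample gradients, add Gaussian noise, then take a descent step in the min variable and an ascent step in the max variable. The toy loss, the function names, the clipping threshold `clip_norm`, the noise multiplier `sigma`, and the step sizes are all assumptions for illustration and are not taken from the paper. Choosing `sigma` for a target $(\epsilon, \delta)$ requires privacy accounting, which is not done here.

```python
import numpy as np


def clip_rows(grads, clip_norm):
    """Rescale each row (one per-sample gradient) to l2-norm at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))


def noisy_mean(per_sample_grads, clip_norm, sigma, rng):
    """Clip, sum, add Gaussian noise with std sigma * clip_norm, then average."""
    clipped = clip_rows(per_sample_grads, clip_norm)
    noise = rng.normal(0.0, sigma * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / clipped.shape[0]


# Toy loss (hypothetical): f(x, y; a) = y * sin(a . x) - 0.5 * y**2,
# which is nonconvex in x and 1-strongly concave in the scalar y.
def toy_grad_x(x, y, batch):
    return (y[0] * np.cos(batch @ x))[:, None] * batch  # shape (b, d)


def toy_grad_y(x, y, batch):
    return (np.sin(batch @ x) - y[0])[:, None]  # shape (b, 1)


def dp_sgda_step(x, y, batch, clip_norm, sigma, eta_x, eta_y, rng):
    """One noisy descent step in x and one noisy ascent step in y."""
    gx = noisy_mean(toy_grad_x(x, y, batch), clip_norm, sigma, rng)
    gy = noisy_mean(toy_grad_y(x, y, batch), clip_norm, sigma, rng)
    return x - eta_x * gx, y + eta_y * gy


rng = np.random.default_rng(0)
data = rng.normal(size=(256, 5))  # n = 256 samples, d = 5
x, y = np.ones(5), np.zeros(1)
for _ in range(100):
    batch = data[rng.choice(len(data), size=32, replace=False)]
    x, y = dp_sgda_step(x, y, batch, clip_norm=1.0, sigma=1.0,
                        eta_x=0.05, eta_y=0.1, rng=rng)
```

Because fresh noise is injected at every iteration, the noise variance accumulates over the run; this is the quantity the abstract says the second method reduces.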

Paper Structure

This paper contains 30 sections, 22 theorems, 101 equations, 8 figures, 5 tables, 3 algorithms.

Key Result

Lemma 1

(Abadi et al., 2016) Consider a sequence of mechanisms $\{\mathcal{A}_t\}_{t \in[T]}$ and the composite mechanism $\mathcal{A}=(\mathcal{A}_1, \cdots, \mathcal{A}_T)$. We have the following properties: (a) [Composability] For any $\lambda$, (b) [Tail bound] For any $\epsilon$, the mechanism $\mathcal{A}$ is $(\epsilon, \delta)$ differentially private for
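For reference, the moments accountant theorem in the cited work (Abadi et al., 2016, "Deep Learning with Differential Privacy") states these two properties in terms of the log moment generating function $\alpha_{\mathcal{A}}(\lambda)$ of the privacy loss. The display below restates that theorem in the notation of this lemma; that the paper's Lemma 1 uses exactly this form is an assumption.

$$
\text{(a)}\quad \alpha_{\mathcal{A}}(\lambda) \le \sum_{t=1}^{T} \alpha_{\mathcal{A}_t}(\lambda),
\qquad
\text{(b)}\quad \delta = \min_{\lambda} \exp\bigl(\alpha_{\mathcal{A}}(\lambda) - \lambda \epsilon\bigr).
$$

Property (a) lets the per-step moments be summed over the $T$ iterations, and property (b) converts the accumulated moment into an $(\epsilon, \delta)$ guarantee.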

Figures (8)

  • Figure 1: Comparison of Gradient Norm, Gradient Variance, and AUC Performance between DP-SGDA and PrivateDiff.
  • Figure 2: Comparison of AUC performance in DP-SGDA and PrivateDiff Minimax on MNIST dataset.
  • Figure 3: Comparison of AUC performance in DP-SGDA and PrivateDiff Minimax on Fashion-MNIST dataset.
  • Figure 4: Non-private Performance across Different Datasets.
  • Figure 5: Impact of Privacy Budget for PrivateDiff Algorithm across Different Datasets.
  • ...and 3 more figures

Theorems & Definitions (47)

  • Definition 1: Differential Privacy (Dwork et al., 2006)
  • Definition 2: $l_2$-sensitivity
  • Definition 3: Gaussian Mechanism
  • Definition 4
  • Lemma 1
  • Lemma 2: Privacy Amplification via Subsampling (Balle et al., 2018)
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8
  • ...and 37 more
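Definitions 2 and 3 in the list above name the $l_2$-sensitivity and the Gaussian mechanism without stating them here. As a minimal sketch, the code below uses the classical calibration $\sigma = \Delta_2 \sqrt{2\ln(1.25/\delta)}/\epsilon$, which is valid for $0 < \epsilon < 1$; the constants in the paper's own definition may differ, and the function names are hypothetical.

```python
import math

import numpy as np


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise std for the classical Gaussian mechanism (requires 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0):
        raise ValueError("classical calibration assumes 0 < epsilon < 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng):
    """Release value plus i.i.d. Gaussian noise calibrated to its l2-sensitivity."""
    value = np.asarray(value, dtype=float)
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return value + rng.normal(0.0, sigma, size=value.shape)


rng = np.random.default_rng(0)
query_output = np.array([0.2, -0.1, 0.4])  # illustrative query result
private_output = gaussian_mechanism(query_output, l2_sensitivity=0.01,
                                    epsilon=0.5, delta=1e-5, rng=rng)
```

The noise scale grows linearly with the sensitivity, which is why a query with smaller sensitivity, such as a difference of gradients at nearby iterates, can be released with less noise.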