Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization
Ruijia Zhang, Mingxi Lei, Meng Ding, Zihang Xiang, Jinhui Xu, Di Wang
TL;DR
This work tackles differential privacy in finite-sum minimax optimization under the nonconvex-strongly-concave regime, a setting that covers deep learning models such as deep AUC maximization. It first analyzes a DP-SGDA baseline and establishes a gradient-norm utility bound of $\tilde{O}(\frac{d^{1/4}}{(n\epsilon)^{1/2}})$, then introduces PrivateDiff Minimax, a variance-reduction approach that leverages gradient differences and restart schemes to achieve the improved rate $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, matching the best-known DP-ERM result for nonconvex losses. The paper also discusses lower bounds for private minimax optimization and validates the theory with experiments on AUC maximization, GANs, and temporal-difference learning, in which PrivateDiff Minimax consistently outperforms DP-SGDA under various privacy budgets. Overall, these results advance private training for nonconvex minimax problems and suggest practical pathways for privacy-preserving deep learning tasks.
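To make the gap between the two rates concrete, the sketch below evaluates both bounds with all constants and the logarithmic factors hidden by $\tilde{O}$ dropped, which is an assumption made purely for illustration. Under that simplification the ratio of the improved rate to the baseline rate is $(\sqrt{d}/(n\epsilon))^{1/6}$, so the improved bound is the smaller of the two exactly when $n\epsilon > \sqrt{d}$, which is also the regime in which either bound falls below 1. The function names are hypothetical and do not come from the paper.

```python
def rate_dp_sgda(d: float, n: float, eps: float) -> float:
    """Baseline bound d^{1/4} / (n*eps)^{1/2}, constants and log factors dropped."""
    return d ** 0.25 / (n * eps) ** 0.5


def rate_improved(d: float, n: float, eps: float) -> float:
    """Improved bound d^{1/3} / (n*eps)^{2/3}, constants and log factors dropped."""
    return d ** (1.0 / 3.0) / (n * eps) ** (2.0 / 3.0)


def rate_ratio(d: float, n: float, eps: float) -> float:
    """Improved rate divided by baseline rate; equals (sqrt(d) / (n*eps))^{1/6}."""
    return rate_improved(d, n, eps) / rate_dp_sgda(d, n, eps)


if __name__ == "__main__":
    # Illustrative values only: d = 10^4 parameters, n = 10^5 samples, eps = 1.
    d, n, eps = 1e4, 1e5, 1.0
    print("baseline:", rate_dp_sgda(d, n, eps))
    print("improved:", rate_improved(d, n, eps))
    print("ratio   :", rate_ratio(d, n, eps))
```

Because the ratio depends on $d$, $n$, and $\epsilon$ only through a sixth root, the relative improvement grows slowly: shrinking the bound by a factor of ten relative to the baseline requires $n\epsilon$ to exceed $\sqrt{d}$ by a factor of $10^6$.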
Abstract
In this paper, we study (finite-sum) minimax optimization in the Differential Privacy (DP) model. Unlike most previous studies, which consider (strongly) convex-concave settings or loss functions satisfying the Polyak-Łojasiewicz condition, we focus on the nonconvex-strongly-concave setting, which encapsulates many models in deep learning such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it yields a DP estimator for which the $\ell_2$-norm of the gradient of the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{(n\epsilon)^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with lower gradient noise variance and improve the upper bound to $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, which matches the best-known result for DP Empirical Risk Minimization with nonconvex loss. We also discuss several lower bounds for private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.
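The abstract does not spell out the DP-SGDA update, so the following is only a minimal sketch of the generic recipe such methods usually follow: clip each per-sample gradient, add Gaussian noise to the clipped sum, then take a descent step on the min variable and an ascent step on the max variable. Everything specific here is an assumption for illustration: the toy loss $f(x, y; a) = y \tanh(\langle a, x \rangle) - \frac{1}{2} y^2$ (nonconvex in $x$, 1-strongly concave in $y$), the names `clip`, `per_sample_grads`, and `dp_sgda`, and all hyperparameters. In particular, `sigma` is a free noise multiplier that is not calibrated to any $(\epsilon, \delta)$ budget, so the code carries no privacy guarantee as written.

```python
import math
import random


def clip(vec, c):
    """Rescale vec so that its l2 norm is at most c."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def per_sample_grads(x, y, a):
    """Gradients of the toy loss f(x, y; a) = y * tanh(<a, x>) - 0.5 * y**2."""
    t = math.tanh(sum(ai * xi for ai, xi in zip(a, x)))
    gx = [y * (1.0 - t * t) * ai for ai in a]  # gradient w.r.t. x (nonconvex part)
    gy = [t - y]                               # gradient w.r.t. y (strongly concave part)
    return gx, gy


def dp_sgda(data, d, steps=200, batch=32, lr_x=0.05, lr_y=0.2,
            clip_norm=1.0, sigma=0.5, seed=0):
    """Noisy clipped SGDA on the toy loss; sigma is NOT tied to a privacy budget."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 0.1) for _ in range(d)]
    y = 0.0
    noise_std = sigma * clip_norm
    for _ in range(steps):
        minibatch = rng.sample(data, batch)
        sum_gx = [0.0] * d
        sum_gy = 0.0
        for a in minibatch:
            gx, gy = per_sample_grads(x, y, a)
            gx = clip(gx, clip_norm)  # bound each sample's influence on the x-update
            gy = clip(gy, clip_norm)  # and on the y-update
            sum_gx = [s + g for s, g in zip(sum_gx, gx)]
            sum_gy += gy[0]
        # Gaussian noise is added to the clipped sums before averaging.
        noisy_gx = [(s + rng.gauss(0.0, noise_std)) / batch for s in sum_gx]
        noisy_gy = (sum_gy + rng.gauss(0.0, noise_std)) / batch
        x = [xi - lr_x * g for xi, g in zip(x, noisy_gx)]  # descent step on x
        y = y + lr_y * noisy_gy                            # ascent step on y
    return x, y


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(500)]
    x_out, y_out = dp_sgda(data, d=5)
    print(x_out, y_out)
```

A real implementation would choose `sigma` from the target $(\epsilon, \delta)$, the sampling scheme, and the number of steps via a privacy accountant. The variance-reduced method proposed in the paper is not reproduced here.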
