Differentially Private Non-convex Distributionally Robust Optimization

Difei Xu, Meng Ding, Zebin Ma, Huanyi Xie, Youming Tao, Aicha Slaitane, Di Wang

TL;DR

A comprehensive study of DP-(finite-sum)-DRO with $\psi$-divergence and non-convex loss. It introduces DP Double-Spider, a novel optimization method tailored to the general $\psi$-divergence structure, and, for KL-divergence, a DP Recursive-Spider method whose utility bound matches the best-known result for non-convex DP-ERM.

Abstract

Real-world deployments routinely face distribution shifts, group imbalances, and adversarial perturbations, under which the traditional Empirical Risk Minimization (ERM) framework can degrade severely. Distributionally Robust Optimization (DRO) addresses this issue by optimizing the worst-case expected loss over an uncertainty set of distributions, offering a principled approach to robustness. Meanwhile, as training data in DRO often involves sensitive information, safeguarding it against leakage under Differential Privacy (DP) is essential. In contrast to classical DP-ERM, DP-DRO has received much less attention due to its minimax optimization structure with an uncertainty constraint. To bridge the gap, we provide a comprehensive study of DP-(finite-sum)-DRO with $\psi$-divergence and non-convex loss. First, we study DRO with general $\psi$-divergence by reformulating it as a minimization problem, and develop a novel $(\varepsilon, \delta)$-DP optimization method, called DP Double-Spider, tailored to this structure. Under mild assumptions, we show that it achieves a utility bound of $\mathcal{O}(\frac{1}{\sqrt{n}}+ (\frac{\sqrt{d \log (1/\delta)}}{n \varepsilon})^{2/3})$ in terms of the gradient norm, where $n$ denotes the data size and $d$ denotes the model dimension. We further improve the utility rate for specific divergences. In particular, for DP-DRO with KL-divergence, by transforming the problem into a compositional finite-sum optimization problem, we develop a DP Recursive-Spider method and show that it achieves a utility bound of $\mathcal{O}((\frac{\sqrt{d \log(1/\delta)}}{n\varepsilon})^{2/3})$, matching the best-known result for non-convex DP-ERM. Experimentally, we demonstrate that our proposed methods outperform existing approaches for DP minimax optimization.
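To make two statements from the abstract concrete, the following Python sketch (i) evaluates both utility rates with all hidden constants set to 1, and (ii) illustrates one common way KL-divergence DRO becomes a compositional finite-sum problem. Part (ii) is an assumption: it uses the textbook KL-regularized form $\max_p \sum_i p_i \ell_i - \lambda\,\mathrm{KL}(p \,\|\, \text{uniform}) = \lambda \log(\frac{1}{n}\sum_i e^{\ell_i/\lambda})$, which may differ from the exact formulation in the paper. The function names and the penalty weight `lam` are hypothetical.

```python
import numpy as np


def utility_rates(n, d, eps, delta):
    """Rates quoted in the abstract, with every hidden constant set to 1 (assumption)."""
    privacy_term = (np.sqrt(d * np.log(1.0 / delta)) / (n * eps)) ** (2.0 / 3.0)
    double_spider = 1.0 / np.sqrt(n) + privacy_term  # general psi-divergence bound
    recursive_spider = privacy_term                  # KL-divergence bound
    return double_spider, recursive_spider


def kl_dro_compositional(losses, lam):
    """Compositional form f(g) with g = mean(exp(loss / lam)) and f = lam * log.

    Computed with the usual max-shift so the exponentials do not overflow.
    """
    z = np.asarray(losses, dtype=float) / lam
    m = z.max()
    return float(lam * (m + np.log(np.mean(np.exp(z - m)))))


def kl_dro_worst_case(losses, lam):
    """Same objective evaluated at the maximizing weights p_i proportional to exp(loss_i / lam)."""
    losses = np.asarray(losses, dtype=float)
    z = losses / lam
    shifted = z - z.max()
    log_p = shifted - np.log(np.sum(np.exp(shifted)))  # log of the softmax weights
    p = np.exp(log_p)
    kl_to_uniform = np.sum(p * (log_p + np.log(losses.size)))
    return float(p @ losses - lam * kl_to_uniform)


if __name__ == "__main__":
    ds, rs = utility_rates(n=10_000, d=100, eps=1.0, delta=1e-5)
    print("Double-Spider rate:", ds, "Recursive-Spider rate:", rs)

    losses = [0.2, 1.5, 0.7, 3.0]
    print(kl_dro_compositional(losses, lam=0.5), kl_dro_worst_case(losses, lam=0.5))
```

With constants ignored, the two rates differ exactly by the $1/\sqrt{n}$ term, and the two KL evaluations agree up to floating-point error, which is what makes the single-level compositional form usable in place of the inner maximization.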

Paper Structure

This paper contains 20 sections, 18 theorems, 78 equations, 3 figures, 3 tables, 3 algorithms.

Key Result

Theorem 1

For any $\varepsilon>0$ and $\delta\in (0, 1)$, let $\sigma_1=\mathcal{O}(\frac{ C_1 \sqrt{T \log (1/\delta)}}{n\sqrt{q}\varepsilon})$, $\sigma_2=\mathcal{O}(\frac{C_2\sqrt{\log (1/\delta)}}{N_2 \varepsilon})$. Similarly, set $\sigma_3=\mathcal{O}(\frac{C_3 \sqrt{T\log (1/\delta)}}{n \sqrt{q}\varepsilon})$ …
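As a rough illustration of how a noise scale of the form given for $\sigma_1$ might be evaluated and applied, the sketch below computes $C\sqrt{T\log(1/\delta)}/(n\sqrt{q}\,\varepsilon)$ and adds Gaussian noise of that scale to a vector. Several things here are assumptions rather than statements from the paper: the excerpt does not define $C_1$, $T$, or $q$, so the code treats them as plain numeric inputs; the constant hidden in $\mathcal{O}(\cdot)$ is exposed as a hypothetical `const` parameter; and norm clipping to `C` is a common DP device swapped in for illustration, not the paper's procedure.

```python
import numpy as np


def noise_scale(C, T, n, q, eps, delta, const=1.0):
    """sigma = const * C * sqrt(T * log(1/delta)) / (n * sqrt(q) * eps).

    `const` stands in for the unspecified constant inside O(.) (assumption).
    """
    return const * C * np.sqrt(T * np.log(1.0 / delta)) / (n * np.sqrt(q) * eps)


def clip_and_perturb(vec, C, sigma, rng):
    """Clip `vec` to Euclidean norm at most C, then add N(0, sigma^2 I) noise.

    Clipping is an illustrative choice, not taken from the paper.
    """
    vec = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(vec)
    clipped = vec * min(1.0, C / norm) if norm > 0 else vec
    return clipped + rng.normal(0.0, sigma, size=vec.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = noise_scale(C=1.0, T=100, n=10_000, q=10, eps=1.0, delta=1e-5)
    noisy = clip_and_perturb(np.full(4, 10.0), C=1.0, sigma=sigma, rng=rng)
    print("sigma:", sigma, "noisy vector:", noisy)
```

The scale shrinks linearly in $n$ and $\varepsilon$ and grows with $\sqrt{T}$, which is the dependence the formula for $\sigma_1$ expresses.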

Figures (3)

  • Figure 1: Experimental results: performance of the four algorithms on CIFAR10-ST, CelebA, Fashion-MNIST, and MNIST-ST, respectively
  • Figure 2: Test AUC results: performance of the four algorithms on CIFAR10-ST, CelebA, Fashion-MNIST, and MNIST-ST, respectively
  • Figure 3: Test F1 score results: performance of the four algorithms on CIFAR10-ST, CelebA, Fashion-MNIST, and MNIST-ST, respectively

Theorems & Definitions (38)

  • Definition 1: Differential Privacy (Dwork et al., 2006)
  • Definition 2
  • Definition 3
  • Definition 4: DP-DRO
  • Definition 5
  • Definition 6
  • Definition 7: Generalized $(L_0,L_1)$-smooth
  • Definition 8
  • Theorem 1
  • Theorem 2
  • ...and 28 more
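
For readers without the full text at hand, Definition 1 refers to the standard notion of $(\varepsilon, \delta)$-differential privacy. The statement below is the usual textbook form, given here as background. The paper's exact wording and its notion of neighboring datasets are not shown in this excerpt and may differ.

```latex
% Standard (\varepsilon, \delta)-differential privacy.
% \mathcal{A} is a randomized algorithm, D and D' are neighboring datasets
% (differing in one record), and S is any measurable set of outputs.
\[
  \Pr[\mathcal{A}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{A}(D') \in S] + \delta .
\]
```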