An Improved Privacy and Utility Analysis of Differentially Private SGD with Bounded Domain and Smooth Losses

Hao Liang, Wanrong Zhang, Xinlei He, Kaishun Wu, Hong Xing

TL;DR

The paper quantifies the privacy loss of differentially private SGD without relying on heavily restrictive assumptions. It introduces a Noisy Smooth-Reduction framework and uses the shift Rényi divergence to derive closed-form $(\alpha,\varepsilon)$-RDP bounds for DPSGD-GC and DPSGD-DC under $L$-smooth, non-convex losses, in both unbounded and bounded domains. The authors also provide accompanying utility analyses and show that a smaller bounded-domain diameter $D$ improves both privacy and utility under certain conditions, with concrete big-O bounds and results for the $\mu$-strongly convex case. Empirical validation via membership inference attacks supports the theoretical insights and illustrates the privacy-utility trade-offs across batch sizes and domain diameters. Overall, the work provides a rigorous privacy analysis in which the privacy loss converges with the number of iterations, and it guides design choices for DPSGD variants.

Abstract

Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to protect sensitive data during the training of machine learning models, but its privacy guarantee often comes at a large cost to model performance due to the lack of tight theoretical bounds quantifying privacy loss. While recent efforts have achieved more accurate privacy guarantees, they still impose assumptions that rarely hold in practical applications, such as convexity and complex parameter requirements, and they rarely investigate in depth the impact of privacy mechanisms on the model's utility. In this paper, we provide a rigorous privacy characterization for DPSGD with general $L$-smooth and non-convex loss functions, revealing that the privacy loss converges with the number of iterations in bounded-domain cases. Specifically, we track the privacy loss over multiple iterations, leveraging the noisy smooth-reduction property, and further establish a comprehensive convergence analysis in different scenarios. In particular, we show that for DPSGD with a bounded domain, (i) the privacy loss can still converge without the convexity assumption, (ii) a smaller bounded diameter can improve both privacy and utility simultaneously under certain conditions, and (iii) the privacy-utility trade-off attains an explicit big-O order for DPSGD with gradient clipping (DPSGD-GC) and for DPSGD-GC with a bounded domain (DPSGD-DC) with a $\mu$-strongly convex population risk function. Experiments via membership inference attack (MIA) in a practical setting validate the insights gained from the theoretical results.

Paper Structure

This paper contains 38 sections, 27 theorems, 105 equations, 5 figures, 1 table, 1 algorithm.

Key Result

Lemma 2.4

(From $(\alpha,\varepsilon)$-RDP to $(\epsilon,\delta)$-DP; Mironov, 2017.) If $\mathcal{M}$ is an $(\alpha,\varepsilon)$-RDP mechanism, then it is also $\left(\varepsilon+\frac{\log(1/\delta)}{\alpha-1},\delta\right)$-DP for any $0<\delta<1$.
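
The conversion in Lemma 2.4 is a one-line formula, and a small numerical sketch may make it easier to use. The function names `rdp_to_dp` and `best_dp_epsilon` below are hypothetical and do not come from the paper. The RDP curve in the usage example is that of a plain Gaussian mechanism with unit sensitivity, $\varepsilon(\alpha)=\alpha/(2\sigma^2)$, chosen only for illustration; it is not one of the paper's DPSGD bounds.

```python
import math


def rdp_to_dp(alpha: float, rdp_eps: float, delta: float) -> float:
    """Epsilon of the (epsilon, delta)-DP guarantee implied by
    (alpha, rdp_eps)-RDP, following Lemma 2.4."""
    if alpha <= 1:
        raise ValueError("the Renyi order alpha must be greater than 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return rdp_eps + math.log(1 / delta) / (alpha - 1)


def best_dp_epsilon(rdp_curve, delta: float) -> float:
    """Smallest DP epsilon over a collection of (alpha, rdp_eps) pairs.

    Each pair yields a valid guarantee on its own, so taking the minimum
    over the available orders is also valid.
    """
    return min(rdp_to_dp(alpha, eps, delta) for alpha, eps in rdp_curve)


if __name__ == "__main__":
    sigma = 2.0  # illustrative noise scale (assumption, not from the paper)
    # Gaussian mechanism with unit L2 sensitivity: eps(alpha) = alpha / (2 sigma^2).
    curve = [(a, a / (2 * sigma**2)) for a in (1.1, 1.5, 2, 4, 8, 16, 32)]
    print(best_dp_epsilon(curve, delta=1e-5))
```

Small orders $\alpha$ keep the RDP term low but inflate the $\log(1/\delta)/(\alpha-1)$ penalty, which is why the sketch scans several orders rather than fixing one.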

Figures (5)

  • Figure 1: Comparison of our theoretical $(\alpha, \varepsilon)$-RDP bound for DPSGD-DC with existing approaches. Detailed assumptions required by each method are summarized in Table 1.
  • Figure 2: The evolution of the (a) estimated and (b) normalized theoretical privacy level during DPSGD-GC with different batch sizes. The shaded error bars correspond to intervals covering 95% of the realized values, obtained from 10 Monte Carlo trials. Note that the privacy bounds in terms of the number of epochs, $E$, can be derived by substituting $T=\lceil \frac{n}{b}\rceil E$ into our main results.
  • Figure 3: The evolution of the privacy level during DPSGD-DC with different diameters of the bounded domain: (a) estimated by MIA and (b) normalized theoretical privacy level $\varepsilon$ with $\alpha=1.1$. The red and blue lines correspond to the cases with batch sizes of $100$ and $400$, respectively.
  • Figure 4: The evolution of the training loss during DPSGD-GC with different batch sizes. The shaded error bars correspond to intervals covering 95% of the realized values, obtained from 10 Monte Carlo trials. Note that the utility bounds in terms of the number of epochs, $E$, can be derived by substituting $T=\lceil \frac{n}{b}\rceil E$ into our main results.
  • Figure 5: Our tighter RDP guarantee for smooth losses during DPSGD-DC over the bounded domain, compared with the trivial bound directly utilizing the post-processing and the Gaussian mechanism property.
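
The captions of Figures 2 and 4 convert epochs to iterations through $T=\lceil \frac{n}{b}\rceil E$. A minimal sketch of that substitution follows; the helper name `iterations_from_epochs` and the dataset size in the usage line are assumptions made for illustration, while the batch sizes 100 and 400 are the ones mentioned in Figure 3.

```python
def iterations_from_epochs(n: int, b: int, epochs: int) -> int:
    """Total iterations T = ceil(n / b) * E for dataset size n,
    batch size b, and E epochs."""
    if n <= 0 or b <= 0 or epochs < 0:
        raise ValueError("n and b must be positive, epochs non-negative")
    steps_per_epoch = -(-n // b)  # integer ceiling of n / b
    return steps_per_epoch * epochs


if __name__ == "__main__":
    n = 1000  # hypothetical dataset size, not taken from the paper
    for b in (100, 400):
        print(b, iterations_from_epochs(n, b, epochs=5))
```

Integer ceiling division avoids the floating-point rounding that `math.ceil(n / b)` could introduce for very large `n`.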

Theorems & Definitions (46)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Lemma 2.4
  • Definition 2.5
  • Lemma 2.6
  • Lemma 3.2
  • proof
  • Theorem 3.3
  • proof
  • ...and 36 more