Auditing Differentially Private Machine Learning: How Private is Private SGD?

Matthew Jagielski, Jonathan Ullman, Alina Oprea

TL;DR

This work interrogates whether DP-SGD's formal privacy guarantees reflect practical protection, proposing an empirical auditing framework based on data poisoning to derive lower bounds on the privacy parameter $\varepsilon$. By constructing poisoned datasets and using a classifier to distinguish DP-SGD outputs, the authors estimate $\varepsilon_{LB}$ and show it can be substantially tighter than prior privacy-attack estimates, revealing real-world privacy gaps. Central to their approach is ClipBKD, a clipping-aware backdoor poisoning attack that remains effective under gradient clipping and outperforms standard membership-inference and backdoor methods across multiple datasets. The results suggest that empirical attack-based auditing can complement analytical DP analyses, guiding parameter choices and highlighting the nuanced roles of clipping, initialization, and data distribution in practical privacy. The paper also demonstrates how the knowledge of $\varepsilon_{LB}$ can inform deployments by illustrating concrete privacy risks and prompting further theoretical refinements.
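The auditing idea above reduces to a counting problem: train many models on a clean dataset and on a poisoned one, count how often a fixed test fires on each, and turn the two frequencies into a statistically valid lower bound on $\varepsilon$. The following is a minimal sketch of that last step, not the authors' implementation. It assumes the standard group-privacy inequality for $k$ changed rows and uses Clopper-Pearson confidence intervals; the function names, the default confidence level, and any counts passed in are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """Return P[X <= k] for X ~ Binomial(n, p), summing the pmf in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect(f, lo, hi, iters=60):
    """Approximate the root of an increasing function f on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(successes, trials, alpha):
    """Two-sided Clopper-Pearson interval; each tail has mass alpha / 2."""
    tail = alpha / 2.0
    lower, upper = 0.0, 1.0
    if successes > 0:
        # P[X >= successes | p] increases with p; solve for it equal to tail.
        lower = _bisect(
            lambda p: (1.0 - binom_cdf(successes - 1, trials, p)) - tail, 0.0, 1.0)
    if successes < trials:
        # P[X <= successes | p] decreases with p; negate so it increases.
        upper = _bisect(
            lambda p: tail - binom_cdf(successes, trials, p), 0.0, 1.0)
    return lower, upper


def epsilon_lower_bound(hits_poisoned, hits_clean, trials, delta, k=1, alpha=0.01):
    """Largest eps refuted by the counts, for datasets differing in k rows."""
    p_poisoned_lo, _ = clopper_pearson(hits_poisoned, trials, alpha)
    _, p_clean_hi = clopper_pearson(hits_clean, trials, alpha)

    def slack(eps):
        # Group-privacy upper bound on the poisoned-side probability minus the
        # observed lower confidence limit; negative means eps is refuted.
        geometric = sum(math.exp(j * eps) for j in range(k))  # (e^{k eps}-1)/(e^eps-1)
        return math.exp(k * eps) * p_clean_hi + geometric * delta - p_poisoned_lo

    if slack(0.0) >= 0.0:
        return 0.0  # the counts are consistent with eps = 0
    hi = 1.0
    while slack(hi) < 0.0:
        hi *= 2.0
    return _bisect(slack, 0.0, hi)
```

A call such as `epsilon_lower_bound(900, 100, 1000, delta=1e-5)` would audit a hypothetical run in which the test fired on 900 of 1000 poisoned-data models and 100 of 1000 clean-data models. Taking the lower confidence limit on one side and the upper limit on the other keeps the estimate conservative: sampling noise can only shrink the reported bound, up to the chosen failure probability `alpha`.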

Abstract

We investigate whether Differentially Private SGD offers better privacy in practice than what is guaranteed by its state-of-the-art analysis. We do so via novel data poisoning attacks, which we show correspond to realistic privacy attacks. While previous work (Ma et al., arXiv 2019) proposed this connection between differential privacy and data poisoning as a defense against data poisoning, our use as a tool for understanding the privacy of a specific mechanism is new. More generally, our work takes a quantitative, empirical approach to understanding the privacy afforded by specific implementations of differentially private algorithms that we believe has the potential to complement and influence analytical work on differential privacy.

Paper Structure

This paper contains 20 sections, 4 theorems, 15 equations, 4 figures, 3 tables, 6 algorithms.

Key Result

Lemma 1

Let $D_0, D_1$ be two datasets differing on at most $k$ rows, let $\mathcal{A}$ be an $(\varepsilon, \delta)$-differentially private algorithm, and let $\mathcal{O}$ be an arbitrary output set. Then
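
For reference, the standard group-privacy bound under these hypotheses (given here as the well-known form of the result, not as a quotation of the lemma's own display) is

$$\Pr[\mathcal{A}(D_0) \in \mathcal{O}] \;\le\; e^{k\varepsilon}\,\Pr[\mathcal{A}(D_1) \in \mathcal{O}] \;+\; \frac{e^{k\varepsilon} - 1}{e^{\varepsilon} - 1}\,\delta .$$

With $k = 1$ the last factor equals $1$, and whenever $\Pr[\mathcal{A}(D_0) \in \mathcal{O}] > \delta$ and $\Pr[\mathcal{A}(D_1) \in \mathcal{O}] > 0$ the inequality rearranges to

$$\varepsilon \;\ge\; \ln\frac{\Pr[\mathcal{A}(D_0) \in \mathcal{O}] - \delta}{\Pr[\mathcal{A}(D_1) \in \mathcal{O}]} ,$$

which is the sense in which measured output probabilities on two neighboring datasets yield a lower bound on $\varepsilon$.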

Figures (4)

  • Figure 1: The distribution of gradients from an iteration of DP-SGD under a clean dataset (blue ellipse) and a poisoned dataset (red ellipse). The right pair depicts traditional backdoors while the left pair depicts our backdoors. Our attack pushes in the direction of least variance, so it is less affected by gradient clipping, as indicated by the smaller overlap between the two distributions.
  • Figure 2: Performance of privacy attacks MI, Backdoor, and ClipBKD on our datasets. LR = logistic regression, FNN = two-layer neural network. Backdoor attacks have not been developed for Purchase-100, so only MI and ClipBKD were run. Backdoors do not provide positive $\varepsilon_{\mathit{LB}}$ on CIFAR10 due to difficulty with the pretrained model.
  • Figure 3: $f_0(x)$ and $f_1(x)$, as defined in Lemma \ref{thm:complement}, with $\delta=10^{-5}$, $k=4$, $p_0=0.6$, $p_1=0.8$.
  • Figure : DP-SGD
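
The Figure 1 caption says the clipping-aware attack pushes in the direction of least variance. One way to picture that is to take the right singular vector of the feature matrix with the smallest singular value and place all poison points along it. The sketch below illustrates that geometric idea only and assumes NumPy is available; centering the data, scaling to the mean feature norm, and the function names are assumptions made for illustration, and label selection for the poison points is left out.

```python
import numpy as np


def least_variance_direction(features):
    """Unit vector along which the centered data varies the least.

    Assumes features has shape (n_samples, n_features) with
    n_samples >= n_features, so the reduced SVD spans the feature space.
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are right singular vectors, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]


def make_poison_points(features, k, scale=None):
    """Return k identical poison points along the least-variance direction."""
    direction = least_variance_direction(features)
    if scale is None:
        # Assumption: match the typical norm of a clean example.
        scale = float(np.mean(np.linalg.norm(features, axis=1)))
    return np.tile(scale * direction, (k, 1))
```

Because clean gradients carry little energy along this direction, a contribution from the poison points there is easier to tell apart from the clean distribution after clipping and noise, which is the effect the caption describes.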

Theorems & Definitions (8)

  • Definition 2.1: Differential Privacy (DworkMNS06)
  • Lemma 1: Group Privacy
  • Theorem 2
  • proof: Proof of Theorem \ref{thm:generic}
  • Lemma 3
  • proof
  • Theorem 4
  • proof