Auditing Differentially Private Machine Learning: How Private is Private SGD?
Matthew Jagielski, Jonathan Ullman, Alina Oprea
TL;DR
This work asks whether DP-SGD's formal privacy guarantees match the protection it provides in practice. The authors propose an empirical auditing framework, based on data poisoning, that yields lower bounds on the privacy parameter $\varepsilon$. They construct poisoned datasets, use a classifier to distinguish DP-SGD outputs trained with and without the poison, and from its success rate estimate a lower bound $\varepsilon_{LB}$. They show this bound can be substantially tighter than the estimates obtained from prior privacy attacks, which helps quantify the gap between the analytical guarantee and the privacy loss an attack can actually demonstrate. Central to the approach is ClipBKD, a clipping-aware backdoor poisoning attack that remains effective under gradient clipping and outperforms standard membership-inference and backdoor methods across multiple datasets. The results suggest that attack-based auditing can complement analytical DP analyses: it can guide parameter choices and it highlights how clipping, initialization, and data distribution affect privacy in practice. The paper also illustrates how $\varepsilon_{LB}$ can inform deployments, by making concrete privacy risks visible and by motivating further theoretical refinements.
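To make the auditing idea concrete, here is a minimal sketch of how distinguishing-test counts could be turned into a statistical lower bound on $\varepsilon$. It rests on the pure-DP inequality $\Pr[A(D) \in O] \le e^{\varepsilon} \Pr[A(D') \in O]$ for neighboring datasets $D, D'$, and uses Clopper-Pearson confidence intervals so the bound holds with confidence $1 - \alpha$. The function names, the choice of interval, and the restriction to a single differing point and to $\delta = 0$ are assumptions made for illustration, not details taken from this summary.

```python
import math

def binom_pmf(i, n, p):
    """P[X = i] for X ~ Binomial(n, p), computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if i == 0 else 0.0
    if p >= 1.0:
        return 1.0 if i == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
               + i * math.log(p) + (n - i) * math.log1p(-p))
    return math.exp(log_pmf)

def clopper_pearson_lower(k, n, alpha):
    """One-sided lower confidence bound on a binomial proportion (level 1 - alpha)."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on p: P[X >= k | p] increases with p
        mid = (lo + hi) / 2
        upper_tail = sum(binom_pmf(i, n, mid) for i in range(k, n + 1))
        if upper_tail < alpha:
            lo = mid
        else:
            hi = mid
    return lo

def clopper_pearson_upper(k, n, alpha):
    """One-sided upper confidence bound on a binomial proportion (level 1 - alpha)."""
    if k == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on p: P[X <= k | p] decreases with p
        mid = (lo + hi) / 2
        lower_tail = sum(binom_pmf(i, n, mid) for i in range(0, k + 1))
        if lower_tail > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def epsilon_lower_bound(hits_a, hits_b, trials, alpha=0.05):
    """Hypothetical helper: lower-bound epsilon from distinguisher counts.

    hits_a: runs on dataset D whose trained model landed in the output set O
    hits_b: runs on neighboring dataset D' whose model landed in O
    Assumes pure DP (delta = 0) and datasets differing in one point.
    """
    p_a = clopper_pearson_lower(hits_a, trials, alpha / 2)
    p_b = clopper_pearson_upper(hits_b, trials, alpha / 2)
    if p_a <= 0.0:
        return 0.0
    return max(0.0, math.log(p_a / p_b))

if __name__ == "__main__":
    # Perfectly separating distinguisher over 100 training runs per dataset.
    print(round(epsilon_lower_bound(100, 0, 100), 2))
```

Note that even a perfect distinguisher yields a finite bound here: with a fixed number of training runs, the confidence intervals cap how large a value of $\varepsilon_{LB}$ can be certified, so stronger claims need more runs.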
Abstract
We investigate whether Differentially Private SGD offers better privacy in practice than what is guaranteed by its state-of-the-art analysis. We do so via novel data poisoning attacks, which we show correspond to realistic privacy attacks. While previous work (Ma et al., arXiv 2019) proposed this connection between differential privacy and data poisoning as a defense against data poisoning, our use of it as a tool for understanding the privacy of a specific mechanism is new. More generally, our work takes a quantitative, empirical approach to understanding the privacy afforded by specific implementations of differentially private algorithms, an approach that we believe has the potential to complement and influence analytical work on differential privacy.
