The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

Thomas Steinke; Milad Nasr; Arun Ganesh; Borja Balle; Christopher A. Choquette-Choo; Matthew Jagielski; Jamie Hayes; Abhradeep Guha Thakurta; Adam Smith; Andreas Terzis

TL;DR

The paper tackles the gap between theory and practice in differential privacy for DP-SGD by focusing on the last-iterate-only setting, where only the final model is released. It proposes a linearized heuristic that yields a computable DP bound: with $P=\mathsf{Binomial}(T,q)+\mathcal{N}(0,\sigma^2 T)$ and $Q=\mathcal{N}(0,\sigma^2 T)$, the heuristic gives $\delta=\max\{H_{e^{\varepsilon}}(P,Q),H_{e^{\varepsilon}}(Q,P)\}$, which can be evaluated before training. In experiments on CIFAR-10 and language-model finetuning (Gemma 2 on PersonaChat), the authors show that the heuristic upper-bounds and tracks empirical privacy auditing results in both vision and NLP, and they note counterexamples where the linearity assumption fails. They compare against standard composition and full-batch baselines, demonstrating a substantial gap between conservative theory and empirical leakage, and argue that the heuristic can guide hyperparameter selection and serve as a realistic target for auditing efforts. The work thus provides a practical tool to estimate privacy leakage prior to training and establishes a benchmark for future theoretical and auditing advances to narrow the gap with real-world leakage.

Abstract

We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks.

Paper Structure (21 sections, 1 theorem, 13 equations, 6 figures, 2 tables, 3 algorithms)

This paper contains 21 sections, 1 theorem, 13 equations, 6 figures, 2 tables, 3 algorithms.

Introduction
Background & Related Work
Our Contributions
Linearized Heuristic Privacy Analysis
Baselines
Empirical Evaluation via Privacy Auditing
Image Classification Experiments
Language Model Experiments
Counterexamples
Warmup: Zeroing Out The Model Weights
Linear Loss $+$ Quadratic Regularizer
Pathological Example
Malicious Dataset Attack
Discussion & Implications
Conclusion
...and 6 more sections

Key Result

Theorem 1

Let $\mathbf{x},T,q,\eta,\sigma,\ell,r$ be as in the DP-SGD algorithm. Assume $r$ and $\ell(\cdot,x)$, for every $x\in \mathcal{X}$, are linear. Letting $P := \mathsf{Binomial}(T,q)+\mathcal{N}(0,\sigma^2 T)$ and $Q := \mathcal{N}(0,\sigma^2 T)$, DP-SGD with last_iterate_only satisfies $(\varepsilon,\delta)$-differential privacy with $\varepsilon\ge0$ arbitrary and $\delta = \max\{H_{e^\varepsilon}(P,Q), H_{e^\varepsilon}(Q,P)\}$. Here, $H_{e^\varepsilon}$ denotes the $e^\varepsilon$-hockey-stick divergence $H_{e^\varepsilon}(P, Q) := \sup_S P(S) - e^\varepsilon Q(S)$.
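The heuristic $\delta$ can be evaluated numerically. The following is a minimal sketch, not the authors' implementation, taking $P=\mathsf{Binomial}(T,q)+\mathcal{N}(0,\sigma^2 T)$ and $Q=\mathcal{N}(0,\sigma^2 T)$ as stated in the TL;DR. It relies on the observation that the likelihood ratio $p(x)/q(x)$ is nondecreasing in $x$, so the supremum in each hockey-stick divergence is attained on a half-line whose endpoint is found by bisection. The function names (`heuristic_delta`, `heuristic_epsilon`), the restriction to $0<q<1$, and the iteration counts are assumptions made for illustration.

```python
import math


def _log_binom_pmf(k, n, p):
    # log Binomial(n, p) pmf at k, via log-gamma for numerical stability.
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def _normal_sf(z):
    # Upper tail probability of a standard normal.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def heuristic_delta(eps, T, q, sigma):
    """delta = max{H(P,Q), H(Q,P)} at order e^eps, assuming 0 < q < 1."""
    if not (0.0 < q < 1.0) or T < 1 or sigma <= 0 or eps < 0:
        raise ValueError("sketch assumes 0 < q < 1, T >= 1, sigma > 0, eps >= 0")
    s = sigma * math.sqrt(T)  # standard deviation of the Gaussian part
    log_w = [_log_binom_pmf(k, T, q) for k in range(T + 1)]
    w = [math.exp(lw) for lw in log_w]

    def log_ratio(x):
        # log(p(x)/q(x)) = logsumexp_k [log w_k + (2kx - k^2) / (2 s^2)]
        terms = [lw + (2 * k * x - k * k) / (2 * s * s)
                 for k, lw in enumerate(log_w)]
        m = max(terms)
        return m + math.log(sum(math.exp(t - m) for t in terms))

    def threshold(target):
        # Solve log_ratio(x) = target by bisection (log_ratio is increasing).
        lo, hi = -1.0, 1.0
        while log_ratio(lo) > target:
            lo *= 2.0
        while log_ratio(hi) < target:
            hi *= 2.0
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if log_ratio(mid) < target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # H(P,Q): optimal set is {x > t}, where p(t) = e^eps q(t).
    t = threshold(eps)
    h_pq = (sum(wk * _normal_sf((t - k) / s) for k, wk in enumerate(w))
            - math.exp(eps) * _normal_sf(t / s))

    # H(Q,P): optimal set is {x < u}, where p(u) = e^-eps q(u).
    # The ratio p/q never drops to w_0 = (1-q)^T, so the set may be empty.
    if -eps <= log_w[0]:
        h_qp = 0.0
    else:
        u = threshold(-eps)
        h_qp = (_normal_sf(-u / s) - math.exp(eps)
                * sum(wk * _normal_sf(-(u - k) / s) for k, wk in enumerate(w)))

    return max(h_pq, h_qp, 0.0)


def heuristic_epsilon(delta, T, q, sigma, max_eps=500.0):
    """Smallest eps (up to bisection error) with heuristic_delta <= delta."""
    if heuristic_delta(0.0, T, q, sigma) <= delta:
        return 0.0
    lo, hi = 0.0, 1.0
    while heuristic_delta(hi, T, q, sigma) > delta:
        lo, hi = hi, 2.0 * hi
        if hi > max_eps:
            raise ValueError("no eps <= max_eps reaches the requested delta")
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if heuristic_delta(mid, T, q, sigma) > delta:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, `heuristic_epsilon(1e-6, T, q, sigma)` gives the heuristic $\varepsilon$ at $\delta=10^{-6}$ for a given configuration. Each call to `heuristic_delta` costs $O(T)$ per bisection step, so this sketch is suited to moderate $T$; it is a heuristic estimate under the linearity assumption, not a privacy guarantee.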

Figures (6)

Figure 1: Comparison of our heuristic to baselines in various parameter regimes. Horizontal axis is the number of iterations $T$ and the vertical axis is $\varepsilon$ such that we have $(\varepsilon,10^{-6})$-DP.
Figure 2: Black-box gradient space attacks fail to achieve tight auditing when other data points are sampled from the data distribution. Heuristic and standard bounds diverge from empirical results, indicating the attack's ineffectiveness. This contrasts with previous work, which achieves tight auditing with access to intermediate updates.
Figure 3: For gradient space attacks with adversarial datasets, the empirical epsilon ($\varepsilon$) closely tracks the final epsilon except at small step counts, where distinguishing is more challenging. This is evident at both subsampling probability values we study ($q=0.01$ and $q=0.1$).
Figure 4: Input space attacks show promising results with both natural and blank image settings, although blank images have higher attack success. These input space attacks achieve tighter results than gradient space attacks in the natural data setting, in contrast to findings from prior work.
Figure 5: Ratio of upper bound on $\varepsilon$ for quadratic loss with $\alpha = 0.5$ divided by maximum $\varepsilon$ of $i$ iterations on a linear loss. In Figure \ref{fig:hqr_eps1} (resp. Figure \ref{fig:hqr_eps2}), for each choice of $q$, $\sigma$ is set so 1 iteration of DP-SGD satisfies $(1, 10^{-6})$-DP (resp. $(2, 10^{-6})$-DP).
...and 1 more figure

Theorems & Definitions (2)

Theorem 1: Privacy of DP-SGD for linear losses
proof
