Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

TL;DR

The paper studies DP-SGD under the hidden state threat model, in which only the final model is released. It introduces gradient-crafting adversaries that fix a gradient sequence in advance to maximize the privacy loss of the final model, without relying on intermediate checkpoints. When the crafted gradient is inserted at every iteration ($k=1$), the empirical auditing lower bound matches the theoretical upper bound, so concealing intermediate updates yields no privacy amplification in this regime. For larger insertion periods in non-convex settings, a gap persists, but it can be narrowed by adversaries that also shape the loss landscape. These results show that privacy amplification is regime-dependent and motivate tighter privacy accounting in non-convex regimes.

Abstract

Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we focus on a threat model where the adversary has access only to the final model, with no visibility into intermediate updates. In the literature, this hidden state threat model exhibits a significant gap between the lower bound from empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that craft a gradient sequence designed to maximize the privacy loss of the final model without relying on intermediate updates. Our experiments show that this approach consistently outperforms previous attempts at auditing the hidden state model. Furthermore, our results advance the understanding of achievable privacy guarantees within this threat model. Specifically, when the crafted gradient is inserted at every optimization step, we show that concealing the intermediate model updates in DP-SGD does not enhance the privacy guarantees. The situation is more complex when the crafted gradient is not inserted at every step: our auditing lower bound matches the privacy upper bound only for an adversarially-chosen loss landscape and a sufficiently large batch size. This suggests that existing privacy upper bounds can be improved in certain regimes.
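
To make the per-step insertion case concrete, the following is a minimal toy sketch and not the paper's experimental setup. It assumes no subsampling, a learning rate of 1, that all other examples contribute zero gradient, and that the adversary inserts a constant crafted gradient of norm $C$ at every step, so only the coordinate along the crafted direction needs to be tracked. The function names (`final_parameter`, `audit_error_rates`, `predicted_error_rate`) are hypothetical. Under these assumptions, the final model alone is a Gaussian test with $\mu = \sqrt{T}/\sigma$, which is also what composing $T$ Gaussian mechanisms gives when every intermediate step is observed.

```python
import random
from math import erfc, sqrt


def final_parameter(include_canary, steps, sigma, clip_norm=1.0, lr=1.0, rng=random):
    """Toy 1-D DP-SGD: return only the final parameter (hidden state).

    Assumption: every gradient except the crafted one is zero, so we track
    the single coordinate along the crafted gradient direction.
    """
    theta = 0.0
    for _ in range(steps):
        # Crafted gradient of norm C, present only if the canary is in the data.
        crafted = clip_norm if include_canary else 0.0
        # Gaussian noise with standard deviation sigma * C, as in DP-SGD.
        noise = rng.gauss(0.0, sigma * clip_norm)
        theta -= lr * (crafted + noise)
    return theta


def audit_error_rates(steps, sigma, trials=4000, seed=0, clip_norm=1.0, lr=1.0):
    """Estimate false positive / false negative rates of a threshold test
    that sees only the final parameter."""
    rng = random.Random(seed)
    # Midpoint between the two final-parameter means (0 and -lr * C * steps).
    threshold = -lr * clip_norm * steps / 2.0
    false_pos = sum(
        final_parameter(False, steps, sigma, clip_norm, lr, rng) < threshold
        for _ in range(trials)
    ) / trials
    false_neg = sum(
        final_parameter(True, steps, sigma, clip_norm, lr, rng) >= threshold
        for _ in range(trials)
    ) / trials
    return false_pos, false_neg


def predicted_error_rate(steps, sigma):
    """Error rate Phi(-mu / 2) of the midpoint test, with mu = sqrt(T) / sigma."""
    mu = sqrt(steps) / sigma
    return 0.5 * erfc(mu / (2.0 * sqrt(2.0)))


if __name__ == "__main__":
    fp, fn = audit_error_rates(steps=16, sigma=4.0)
    print("empirical:", fp, fn, "predicted:", predicted_error_rate(16, 4.0))
```

In this toy setting, hiding the intermediate iterates does not change the distinguishing power of the test, which is the intuition behind the $k=1$ result. The sketch says nothing about the regime where the crafted gradient is not inserted at every step.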

Paper Structure

This paper contains 22 sections, 2 theorems, 9 equations, 16 figures, 2 tables, 6 algorithms.

Key Result

Lemma 1

Assume that an adversary has error rates $\alpha_M$, $\beta_M$ in the binary test defined by $P=M(D)$ and $Q=M(D')$, where $M$ is a mechanism and $D$ and $D'$ are two neighboring datasets. If $M$ satisfies $\mu$-GDP, then
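
The conclusion of the lemma is not reproduced here. Under the standard definition of $\mu$-GDP, the trade-off function is $\beta \ge \Phi(\Phi^{-1}(1-\alpha) - \mu)$, which rearranges to $\mu \ge \Phi^{-1}(1-\alpha) - \Phi^{-1}(\beta)$. The sketch below turns a pair of error rates into this lower bound on $\mu$ and then into an $\varepsilon$ at a given $\delta$ using the standard GDP-to-$(\varepsilon,\delta)$ conversion. The function names are hypothetical, and the paper's own auditing pipeline may differ, for instance in how the error rates are estimated.

```python
from math import erfc, exp, sqrt
from statistics import NormalDist


def std_normal_cdf(x):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * erfc(-x / sqrt(2.0))


def gdp_lower_bound(alpha, beta):
    """Lower bound mu >= Phi^{-1}(1 - alpha) - Phi^{-1}(beta).

    alpha and beta must lie strictly between 0 and 1, otherwise
    NormalDist.inv_cdf raises StatisticsError.
    """
    inv = NormalDist().inv_cdf
    return inv(1.0 - alpha) - inv(beta)


def gdp_delta(mu, eps):
    """delta(eps) for a mu-GDP mechanism:
    Phi(-eps/mu + mu/2) - exp(eps) * Phi(-eps/mu - mu/2)."""
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))


def gdp_to_epsilon(mu, delta, eps_max=100.0, iters=200):
    """Smallest eps (up to bisection precision) with gdp_delta(mu, eps) <= delta.

    The result is capped at eps_max, an arbitrary search limit chosen here.
    """
    if mu <= 0.0:
        return 0.0
    lo, hi = 0.0, eps_max
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gdp_delta(mu, mid) > delta:
            lo = mid  # delta still too large, need a bigger eps
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    mu_hat = gdp_lower_bound(alpha=0.05, beta=0.05)
    print("mu lower bound:", mu_hat)
    print("eps at delta=1e-5:", gdp_to_epsilon(mu_hat, 1e-5))
```

Smaller error rates give a larger $\mu$ and hence a larger $\varepsilon$: a more successful adversary certifies a higher privacy loss.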

Figures (16)

  • Figure 1: Auditing results for $\mathcal{A}_{GC}$ (ours) and $\mathcal{A}_L$ on ConvNet (Fig. \ref{fig:k1_sub1}) and ResNets (Fig. \ref{fig:k1_sub2}) at periodicity $k=1$ and $C \in \{1, 2, 4\}$. Fig. \ref{fig:low_dim_k=1} presents the results for $\mathcal{A}_{GC}$-R (ours), $\mathcal{A}_{GC}$-S (ours) and $\mathcal{A}_{L}$ on FCNN (Housing dataset) at periodicity $k=1$.
  • Figure 2: Auditing results for $\mathcal{A}_{GC}$ (ours) and $\mathcal{A}_L$ on ConvNet and ResNet (CIFAR10) with $C=1$ at periodicity $k=5$ (Fig. \ref{fig:k5_sub1}) and $k=25$ (Fig. \ref{fig:k25_sub2}).
  • Figure 3: Auditing performance of our adversary $\mathcal{A}^{h^*}_S$ across $T=25$ steps. Fig. \ref{fig:auditing_t=2} shows the evolution of the auditing performance over time for $\sigma \in \{1, 8\}$ and batch size $B \in \{1, 2, 4, 8, 16\}$. Fig. \ref{fig:amplification_rate} gives the privacy amplification rate, i.e., the ratio $\hat{\varepsilon}_{t=25} / \hat{\varepsilon}_{t=1}$ between the privacy auditing lower bounds at step 25 ($\hat{\varepsilon}_{t=25}$) and at step 1 ($\hat{\varepsilon}_{t=1}$).
  • Figure 4: Auditing results of $\mathcal{A}_L$, $\mathcal{A}_{GC}$-R and $\mathcal{A}_{GC}$-S on the Housing dataset. We consider 4 variants of $\mathcal{A}_{GC}$-S, depending on whether the noisy or noiseless simulation is used and whether we rank dimensions based on accumulating per-step updates (PS) or on the final model norm difference (FM).
  • Figure 5: Auditing results for $\mathcal{A}_{GC}$-S compared to $\mathcal{A}_{GC}$-R at $k \in \{1, 5\}$ on ConvNet and ResNet18. We observe that the two adversaries are equivalent for these over-parameterized models, demonstrating that $\mathcal{A}_{GC}$-S only enhances our attack against under-parameterized models.
  • ...and 11 more figures

Theorems & Definitions (12)

  • Definition 1: $(\varepsilon, \delta)$-Differential Privacy
  • Remark 1: On the impact of known initialization
  • Remark 2: On pre-trained models
  • Remark 3
  • Example 1: Constant $g$
  • Remark 4: Comparison to parallel work
  • Definition 2: Error rates
  • Definition 3: Trade-off functions
  • Definition 4: $f$-Differential Privacy
  • Definition 5: Gaussian Differential Privacy
  • ...and 2 more