
Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification

Jan Schuchardt, Mihail Stoian, Arthur Kosmala, Stephan Günnemann

TL;DR

This work proposes the first general framework for deriving mechanism-specific subsampling guarantees, which leverage information beyond the original mechanism's privacy parameters to more tightly characterize the subsampled mechanism's privacy.

Abstract

Amplification by subsampling is one of the main primitives in machine learning with differential privacy (DP): Training a model on random batches instead of complete datasets results in stronger privacy. This is traditionally formalized via mechanism-agnostic subsampling guarantees that express the privacy parameters of a subsampled mechanism as a function of the original mechanism's privacy parameters. We propose the first general framework for deriving mechanism-specific guarantees, which leverage additional information beyond these parameters to more tightly characterize the subsampled mechanism's privacy. Such guarantees are of particular importance for privacy accounting, i.e., tracking privacy over multiple iterations. Overall, our framework based on conditional optimal transport lets us derive existing and novel guarantees for approximate DP, accounting with Rényi DP, and accounting with dominating pairs in a unified, principled manner. As an application, we analyze how subsampling affects the privacy of groups of multiple users. Our tight mechanism-specific bounds outperform tight mechanism-agnostic bounds and classic group privacy results.

Paper Structure (105 sections, 68 theorems, 255 equations, 28 figures)

Key Result

Proposition 2.5

If mechanism $M : {\mathbb{X}} \rightarrow {\mathbb{R}}^D$ is $(\varepsilon,\delta)$-DP under relation $\simeq_{\mathbb{X}}$, then it is $(K \cdot \varepsilon, K \cdot e^{K \cdot \varepsilon} \cdot \delta)$-DP under group relation $\{(x, x') \in {\mathbb{X}}^2 \mid d_{\mathbb{X}}(x,x') = K\}$.
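
As a rough illustration of how quickly the classic bound in Proposition 2.5 degrades with group size, the following sketch evaluates $(K \cdot \varepsilon, K \cdot e^{K \cdot \varepsilon} \cdot \delta)$ directly. The function name `group_privacy` and the example values of $\varepsilon$, $\delta$, and $K$ are assumptions chosen for illustration, not taken from the paper.

```python
import math


def group_privacy(eps: float, delta: float, k: int) -> tuple[float, float]:
    """Evaluate the classic group privacy bound of Proposition 2.5.

    Hypothetical helper: maps the (eps, delta) of a mechanism under the
    base relation to (k * eps, k * exp(k * eps) * delta) for groups of size k.
    """
    if k < 1:
        raise ValueError("group size k must be a positive integer")
    if eps < 0 or not 0 <= delta <= 1:
        raise ValueError("expected eps >= 0 and 0 <= delta <= 1")
    return k * eps, k * math.exp(k * eps) * delta


# Illustrative values only (assumed, not from the paper).
group_eps, group_delta = group_privacy(eps=0.1, delta=1e-6, k=3)
print(f"group epsilon = {group_eps:.3f}, group delta = {group_delta:.3e}")
```

Both parameters grow at least linearly in $K$, and $\delta$ additionally picks up the factor $e^{K \cdot \varepsilon}$, which is the looseness that the paper's mechanism-specific group bounds aim to improve on.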

Figures (28)

  • Figure 1: Group members $x_1',x_2'$ contribute to a dataset, while group member $x_3'$ does not. For small subsampling rates $r$, it is unlikely to access a single ($\Pr = 2r(1-r)$) or even both ($\Pr=r^2$) inserted elements when applying a base mechanism $B$ to a subsampled batch (e.g., the yellow one). This further obfuscates which data was contributed by members of group $\{x'_1, x'_2, x'_3\}$.
  • Figure 2: Mechanism-agnostic guarantees for (a) graph modification [daigavane2022nodelevel, ayle2022training, zihang2024preserving], (b) insertion/removal [li2012sampling, balle2018couplings, mironov2019poisson, zhu2019poisson], and (c) substitution [bun2015differentially, ullman2017, balle2018couplings, wang2019uniform] can be derived from (d) our proposed framework. In (b-c), events $A_i$ and $E_j$ indicate the presence of inserted or substituted elements.
  • Figure 3: Randomized response with WOR subsampling ($q \mathbin{/} N = 0.001$), group size $1$, and varying true response probability $\theta$.
  • Figure 4: Laplace mechanisms with scale $\lambda = 1$, Poisson subsampling ($r=0.2$), and varying group size.
  • Figure 5: Gaussian mechanisms with standard deviation $\sigma = 2$, Poisson subsampling ($r=0.2$), and varying group size.
  • ...and 23 more figures
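
The probabilities quoted in the Figure 1 caption ($2r(1-r)$ for a single inserted element, $r^2$ for both) are consistent with each element being included in the batch independently with probability $r$. Under that assumption, the number of sampled group elements follows a binomial distribution. The sketch below, with the hypothetical helper `prob_group_elements_sampled` and an assumed rate $r = 0.1$, reproduces the two values for a group that contributed two elements.

```python
from math import comb


def prob_group_elements_sampled(num_inserted: int, r: float, j: int) -> float:
    """Probability that exactly j of num_inserted group elements land in a batch.

    Assumes each element is included independently with probability r
    (an assumption inferred from the probabilities in the caption).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("subsampling rate r must lie in [0, 1]")
    if not 0 <= j <= num_inserted:
        raise ValueError("j must lie between 0 and num_inserted")
    return comb(num_inserted, j) * r**j * (1.0 - r) ** (num_inserted - j)


r = 0.1  # illustrative subsampling rate, not taken from the paper
p_single = prob_group_elements_sampled(2, r, 1)  # equals 2 * r * (1 - r)
p_both = prob_group_elements_sampled(2, r, 2)    # equals r ** 2
p_none = prob_group_elements_sampled(2, r, 0)    # equals (1 - r) ** 2
print(p_single, p_both, p_none)
```

For small $r$, most of the probability mass sits on batches that contain none of the group's elements, which is the intuition behind group privacy amplification described in the caption.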

Theorems & Definitions (141)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.2
  • Definition 2.3
  • Example 2.4
  • Proposition 2.5: vadhan2017complexity
  • Lemma 3.1
  • Definition 3.2
  • Theorem 3.3
  • Theorem 3.4
  • ...and 131 more