Gaussian Differential Privacy

Jinshuo Dong, Aaron Roth, Weijie J. Su

TL;DR

This paper introduces f-differential privacy (f-DP), a hypothesis-testing–based relaxation of differential privacy that captures the full privacy trade-off via trade-off functions, and identifies Gaussian differential privacy (GDP) as a canonical specialization. It shows that composition is closed and losslessly described by tensor products of trade-off functions, with GDP yielding simple, accurate composition via an additive-variance-like rule. By establishing a primal–dual connection to (ε,δ)-DP, the framework enables importing existing DP results and provides a clean subsampling theorem that tightens privacy guarantees beyond traditional (ε,δ)-DP bounds. The authors then apply f-DP to analyze privacy in stochastic gradient descent, deriving both asymptotic GDP-based guarantees and Berry–Esseen-type bounds that yield practical, computationally efficient privacy estimates for iterative private optimization. Overall, f-DP offers a coherent, tractable, and versatile toolkit for private data analysis with strong theoretical and practical implications for modular privacy accounting and private learning workflows.
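
The additive-variance-like rule can be sketched in a few lines. The paper's composition result for GDP says that composing mechanisms that are $\mu_1$-GDP, ..., $\mu_n$-GDP gives a $\sqrt{\mu_1^2 + \cdots + \mu_n^2}$-GDP mechanism. The function name `compose_gdp` and the example parameters below are illustrative assumptions, not taken from the paper.

```python
import math


def compose_gdp(mus):
    """GDP parameter of the composition of mechanisms that are mu_i-GDP.

    Squared parameters add, much like variances of independent noise.
    `compose_gdp` is a hypothetical helper name used for illustration.
    """
    if any(mu < 0 for mu in mus):
        raise ValueError("GDP parameters must be non-negative")
    return math.sqrt(sum(mu * mu for mu in mus))


# Illustrative values only: 100 steps, each assumed to be 0.1-GDP.
total_mu = compose_gdp([0.1] * 100)
print(f"composed GDP parameter: {total_mu:.3f}")
```

Because only the sum of squares matters, an accountant built this way needs to track a single running number, whatever the order of the mechanisms.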

Abstract

Differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy in the past decade. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analyzing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation, which we term `$f$-differential privacy' ($f$-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, $f$-DP preserves the hypothesis testing interpretation. In addition, $f$-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for original DP to $f$-DP and, as an application, obtain a simple subsampling theorem for $f$-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the $f$-DP class that is referred to as `Gaussian differential privacy' (GDP), defined based on testing two shifted Gaussians. GDP is focal among the $f$-DP class because of a central limit theorem we prove. More precisely, the privacy guarantees of \emph{any} hypothesis-testing-based definition of privacy (including original DP) converge to GDP in the limit under composition. The CLT also yields a computationally inexpensive tool for analyzing the exact composition of private algorithms. Taken together, this collection of attractive properties renders $f$-DP a mathematically coherent, analytically tractable, and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved privacy analysis of noisy stochastic gradient descent.
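
To make the link between GDP and original DP concrete, here is a minimal sketch of the conversion the paper derives for GDP: a mechanism is $\mu$-GDP if and only if it is $(\varepsilon, \delta(\varepsilon))$-DP for all $\varepsilon \geqslant 0$, with $\delta(\varepsilon) = \Phi(-\varepsilon/\mu + \mu/2) - \mathrm{e}^{\varepsilon}\,\Phi(-\varepsilon/\mu - \mu/2)$. The helper names, the bisection search, and the default `eps_max` are assumptions made for illustration.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gdp_delta(mu, eps):
    """delta(eps) such that a mu-GDP mechanism is (eps, delta(eps))-DP."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - math.exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))


def gdp_epsilon(mu, delta, eps_max=100.0, tol=1e-10):
    """Smallest eps >= 0 with gdp_delta(mu, eps) <= delta, found by bisection.

    Relies on delta(eps) being decreasing in eps; eps_max is an assumed
    search bound, not a value from the paper.
    """
    if gdp_delta(mu, 0.0) <= delta:
        return 0.0
    if gdp_delta(mu, eps_max) > delta:
        raise ValueError("eps_max is too small for this (mu, delta)")
    lo, hi = 0.0, eps_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gdp_delta(mu, mid) > delta:
            lo = mid
        else:
            hi = mid
    return hi


mu = 1.0  # illustrative GDP parameter
for eps in (0.0, 1.0, 2.0, 4.0):
    print(f"eps={eps:.1f}  delta={gdp_delta(mu, eps):.6f}")
print(f"eps at delta=1e-5: {gdp_epsilon(mu, 1e-5):.4f}")
```

A single $\mu$ therefore stands for a whole curve of $(\varepsilon, \delta)$ pairs. Reporting one pair keeps only one point of that curve.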

Paper Structure

This paper contains 29 sections, 81 theorems, 338 equations, 14 figures, 1 algorithm.

Key Result

Proposition 2.1

A function $f: [0, 1] \rightarrow [0, 1]$ is a trade-off function if and only if $f$ is convex, continuous, ...

Footnote: Convexity itself implies continuity in $(0,1)$ for $f$. In addition, $f(\alpha) \geqslant 0$ and $f(\alpha) \leqslant 1-\alpha$ imply continuity at 1. Hence, the continuity condition only matt...
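
The conditions visible in this excerpt (convexity, values in $[0,1]$, and $f(\alpha) \leqslant 1-\alpha$) can be spot-checked numerically. The sketch below does so on a finite grid for $f_{\varepsilon,\delta}(\alpha) = \max\{0,\ 1-\delta-\mathrm{e}^{\varepsilon}\alpha,\ \mathrm{e}^{-\varepsilon}(1-\delta-\alpha)\}$, the trade-off function of $(\varepsilon,\delta)$-DP. The checker name, grid size, and tolerance are assumptions. A grid check is only a sanity test, not a proof, and it does not cover any conditions of the full proposition that are cut off here.

```python
import math


def f_eps_delta(alpha, eps, delta):
    """Trade-off function of (eps, delta)-DP: a maximum of three linear pieces."""
    return max(0.0,
               1.0 - delta - math.exp(eps) * alpha,
               math.exp(-eps) * (1.0 - delta - alpha))


def passes_grid_checks(f, grid_size=1001, tol=1e-12):
    """Check range, f(a) <= 1 - a, and convexity of f on a uniform grid.

    Hypothetical helper: it tests only the conditions quoted above.
    """
    xs = [i / (grid_size - 1) for i in range(grid_size)]
    ys = [f(x) for x in xs]
    in_range = all(-tol <= y <= 1.0 + tol for y in ys)
    below_diagonal = all(y <= 1.0 - x + tol for x, y in zip(xs, ys))
    # On a uniform grid, convexity means non-negative second differences.
    convex = all(ys[i - 1] + ys[i + 1] - 2.0 * ys[i] >= -tol
                 for i in range(1, len(ys) - 1))
    return in_range and below_diagonal and convex


# Illustrative parameters: eps = 1, delta = 0.05.
print(passes_grid_checks(lambda a: f_eps_delta(a, 1.0, 0.05)))
# A concave curve such as 1 - a^2 lies above the diagonal and fails the checks.
print(passes_grid_checks(lambda a: 1.0 - a * a))
```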

Figures (14)

  • Figure 1: Left: Our central limit theorem based approximation (in blue) is very close to the composition of just $10$ mechanisms (in red). The tightest possible approximation via an $(\varepsilon,\delta)$-DP guarantee (in black) is substantially looser. See \ref{fig:comp} for parameter setup. Right: Privacy analysis of stochastic gradient descent used to train a convolutional neural network on MNIST \cite{lecun-mnisthandwrittendigit-2010}. The $f$-DP framework yields a privacy guarantee (in red) for this problem that is significantly better than the optimal $(\varepsilon,\delta)$-DP guarantee (in black) that is derived from the moments accountant (MA) method \cite{deep}. Put simply, our analysis shows that stochastic gradient descent releases less sensitive information than expected in the literature. See \ref{sec:application_in_sgd} for more plots and details.
  • Figure 2: Three different examples of $T(M(S),M(S'))$. Only the dashed line corresponds to a trade-off function satisfying $f$-DP.
  • Figure 3: Left: $f_{\varepsilon,\delta}$ is a piecewise linear function and is symmetric with respect to the line $y = x$. It has (nontrivial) slopes $-\mathrm{e}^{\pm\varepsilon}$ and intercepts $1-\delta$. Right: Trade-off functions of unit-variance Gaussian distributions with different means. The case of $\mu=0.5$ is reasonably private, $\mu=1$ is borderline private, and $\mu=3$ is basically non-private: an adversary can control type I and type II errors simultaneously at only 0.07. In the case of $\mu=6$ (almost coincides with the axes), the two errors both can be as small as 0.001.
  • Figure 4: Each $(\varepsilon,\delta(\varepsilon))$-DP guarantee corresponds to two supporting linear functions (symmetric to each other) to the trade-off function describing the complete $f$-DP guarantee. In general, characterizing a privacy guarantee using only a subset of $(\varepsilon,\delta)$-DP guarantees (for example, only those with small $\delta$) would result in information loss.
  • Figure 5: Left: Tensoring with $f_{0,\delta}$ scales the graph towards the origin by a factor of $1-\delta$. Right: 10-fold composition of $(1/\sqrt{10},0)$-DP mechanisms, that is, $f_{\varepsilon,0}^{\otimes n}$ with $n=10, \varepsilon=1/\sqrt{n}.$ The dashed curve corresponds to $\varepsilon=2.89,\delta = 0.001$. These values are obtained by first setting $\delta = 0.001$ and finding the smallest $\varepsilon$ such that the composition is $(\varepsilon,\delta)$-DP. Note that the central limit theorem approximation to the true trade-off curve is almost perfect, whereas the tightest possible approximation via $(\varepsilon,\delta)$-DP is substantially looser.
  • ...and 9 more figures
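
The error levels quoted in the Figure 3 caption can be reproduced with a short sketch. It uses the Gaussian trade-off function $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$ for testing $\mathcal{N}(0,1)$ against $\mathcal{N}(\mu,1)$. Setting the type I and type II errors equal, $\alpha = G_\mu(\alpha)$, gives $\alpha = \Phi(-\mu/2)$. The function names are illustrative assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def gaussian_tradeoff(alpha, mu):
    """Type II error of the optimal level-alpha test of N(0,1) vs N(mu,1)."""
    if alpha <= 0.0:
        return 1.0
    if alpha >= 1.0:
        return 0.0
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - mu)


def equal_error_point(mu):
    """Error level at which type I and type II errors coincide under G_mu."""
    return _STD_NORMAL.cdf(-mu / 2.0)


for mu in (0.5, 1.0, 3.0, 6.0):
    a = equal_error_point(mu)
    # G_mu(a) should reproduce a, confirming the fixed point.
    print(f"mu={mu}: both errors = {a:.4f}, G_mu(a) = {gaussian_tradeoff(a, mu):.4f}")
```

For $\mu=3$ the common error is about 0.067, and for $\mu=6$ it is about 0.0013. These match the 0.07 and 0.001 in the caption.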

Theorems & Definitions (144)

  • Definition 1.1: DMNS06approxdp
  • Definition 2.1: trade-off function
  • Proposition 2.1
  • Definition 2.2: $f$-differential privacy
  • Proposition 2.2
  • Proposition 2.3: wasserman_zhou
  • Definition 2.4
  • Theorem 2.5
  • Proof of Theorem \ref{thm:g_mech}
  • Proposition 2.6
  • ...and 134 more