Privacy Amplification for the Gaussian Mechanism via Bounded Support

Shengyuan Hu, Saeed Mahloujifar, Virginia Smith, Kamalika Chaudhuri, Chuan Guo

TL;DR

The paper tackles the conservatism of worst-case differential privacy by leveraging two data-dependent accounting frameworks: per-instance differential privacy (pDP) and Fisher information loss (FIL). It introduces Gaussian mechanisms with bounded support, specifically the rectified and truncated Gaussian, and analyzes their privacy amplification under both FIL and per-instance Rényi DP (RDP), showing that the amplification is stronger when the input lies in the tails, that is, when $|\theta|$ is large. The authors derive closed-form expressions for FIL and RDP for these bounded mechanisms, establish coordinate-wise tensorization to handle high-dimensional data, and discuss the challenges of subsampling. Empirical results on private SGD show reductions in privacy cost of up to about 30% in some settings, with little to no loss in utility on benchmarks such as CIFAR-10/100. The work points toward tighter, data-aware privacy guarantees for real-world ML systems and highlights future directions such as per-coordinate subsampling and deeper integration with compression-inspired methods.
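
To make the two bounded-support mechanisms named above concrete, here is a minimal sketch of how one might sample from them. It assumes the usual meanings of the terms: the rectified Gaussian clamps a noisy output to $[a,b]$, while the truncated Gaussian conditions the noisy output on landing in $[a,b]$. The function names, the use of rejection sampling, and the parameter choices are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def rectified_gaussian(theta, sigma, a, b, rng):
    """Add N(0, sigma^2) noise, then clamp to [a, b].

    Out-of-range mass collapses onto the endpoints a and b.
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    noisy = theta + sigma * rng.standard_normal(theta.shape)
    return np.clip(noisy, a, b)


def truncated_gaussian(theta, sigma, a, b, rng):
    """Sample N(theta, sigma^2) conditioned on [a, b] by rejection.

    Assumption: theta is not far outside [a, b]; otherwise rejection
    sampling becomes slow and an inverse-CDF sampler is preferable.
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    out = theta + sigma * rng.standard_normal(theta.shape)
    bad = (out < a) | (out > b)
    while bad.any():
        # Redraw only the coordinates that fell outside the support.
        out[bad] = theta[bad] + sigma * rng.standard_normal(int(bad.sum()))
        bad = (out < a) | (out > b)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.full(5, 0.9)
    print(rectified_gaussian(theta, 1.0, -1.0, 1.0, rng))
    print(truncated_gaussian(theta, 1.0, -1.0, 1.0, rng))
```

Both samplers act coordinate-wise, which matches the coordinate-wise treatment of high-dimensional outputs described in the summary. The visible difference is that the rectified output places positive probability exactly on the endpoints, whereas the truncated output does not.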

Abstract

Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset. These guarantees can be desirable compared to vanilla DP in real-world settings as they tightly upper-bound the privacy leakage for a $\textit{specific}$ individual in an $\textit{actual}$ dataset, rather than considering worst-case datasets. While these frameworks are beginning to gain popularity, to date, there is a lack of private mechanisms that can fully leverage the advantages of data-dependent accounting. To bridge this gap, we propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting. Experiments on model training with DP-SGD show that using bounded-support Gaussian mechanisms can provide a reduction of the pDP bound $\epsilon$ by as much as 30% without negative effects on model utility.

Paper Structure (28 sections, 14 theorems, 47 equations, 4 figures, 3 tables, 1 algorithm)

Key Result

Lemma 4.1

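The FIL of $\theta^T\sim\mathcal{N}^T\left(\theta,\sigma^2,[a,b]\right)$ is given by $\eta=\eta^T\|J_f\|_2$, and the FIL of $\theta^R\sim\mathcal{N}^R\left(\theta,\sigma^2,[a,b]\right)$ is given by $\eta=\eta^R\|J_f\|_2$, where $\mathcal{N}^T$ and $\mathcal{N}^R$ denote the truncated and rectified Gaussian, and $\eta^T$ and $\eta^R$ are the closed-form factors derived in the paper.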
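
For intuition, the scalar factors in the lemma can be explored numerically. The sketch below assumes (this summary does not confirm it) that $\eta^T$ and $\eta^R$ are the square roots of the Fisher information of the one-dimensional output with respect to $\theta$. The formulas are a standard textbook calculation for truncated and censored normal distributions and may differ in form from the expressions in the paper; all function names are hypothetical.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal CDF, via erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def _sf(z):
    """Standard normal upper-tail probability 1 - CDF(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fisher_truncated(theta, sigma, a, b):
    """Fisher information about theta of N(theta, sigma^2) truncated to [a, b].

    Uses the exponential-family identity I(theta) = Var(X) / sigma^4.
    """
    al, be = (a - theta) / sigma, (b - theta) / sigma
    z = _cdf(be) - _cdf(al)
    mean_shift = (_pdf(al) - _pdf(be)) / z
    var_over_sigma2 = 1.0 + (al * _pdf(al) - be * _pdf(be)) / z - mean_shift ** 2
    return var_over_sigma2 / sigma ** 2


def fisher_rectified(theta, sigma, a, b):
    """Fisher information about theta of N(theta, sigma^2) clamped to [a, b].

    Sums the contribution of the interior density and the two endpoint atoms.
    """
    al, be = (a - theta) / sigma, (b - theta) / sigma
    interior = _cdf(be) - _cdf(al) + al * _pdf(al) - be * _pdf(be)
    atoms = _pdf(al) ** 2 / _cdf(al) + _pdf(be) ** 2 / _sf(be)
    return (interior + atoms) / sigma ** 2


def fil(fisher_info, jacobian_norm=1.0):
    """Assumed FIL: sqrt(Fisher information) times the Jacobian norm ||J_f||_2."""
    return math.sqrt(fisher_info) * jacobian_norm


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0):
        print(theta,
              fil(fisher_truncated(theta, 1.0, -1.0, 1.0)),
              fil(fisher_rectified(theta, 1.0, -1.0, 1.0)))
```

Under these assumptions, with $\sigma=1$ and support $[-1,1]$, the Fisher information at $\theta=0$ is roughly 0.29 for the truncated and 0.94 for the rectified Gaussian, compared with $1/\sigma^2=1$ for the vanilla Gaussian. Both values shrink further as $\theta$ moves toward the boundary, in line with the stronger amplification for large $|\theta|$ reported in Figure 1.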

Figures (4)

  • Figure 1: FIL $\eta$ and per-instance RDP $\epsilon$ for different variants of the Gaussian mechanism with input $\theta$. Compared to the vanilla Gaussian mechanism, the rectified Gaussian, truncated Gaussian, and stochastic sign mechanisms enjoy stronger privacy guarantees, especially when $|\theta|$ is large. The bounded support set is $\mathcal{B}=[-1,1]$ in all panels. Sensitivity is fixed at 1 for the RDP results.
  • Figure 2: Biasedness-privacy-utility tradeoffs for synthetic mean estimation. Left: Relation between biasedness and privacy amplification. A smaller $\epsilon_{\mathcal{N}^R}/\epsilon_{\mathcal{N}}$ means stronger amplification. Right: Relation between biasedness and utility change for the same hyperparameter combinations shown in the left panel.
  • Figure 3: Comparison between bounded Gaussian mechanism and Gaussian mechanism in terms of pDP-for-all utility tradeoff on three datasets. The $y$-axis is the ratio between bounded Gaussian pDP $\epsilon$ and Gaussian pDP $\epsilon$. The gray dotted line represents no amplification. We limit the true accuracy for both mechanisms to be no more than $1\%$ higher than the target accuracy reported in the figure.
  • Figure 4: Comparison between bounded Gaussian mechanism and Gaussian mechanism in terms of FIL-utility tradeoff.

Theorems & Definitions (30)

  • Definition 2.1: Differential Privacy (Dwork and Roth, 2014)
  • Definition 2.2: Per-instance Differential Privacy (pDP) for all (Wang, 2019)
  • Definition 2.3: Per-instance Rényi Differential Privacy for all
  • Definition 2.4: Fisher information loss (Hannun et al., 2021)
  • Definition 3.1
  • Definition 3.2
  • Definition 3.3: Bounded Gaussian mechanism
  • Lemma 4.1
  • Theorem 4.2
  • Lemma 4.3: Proposition 7 from Mironov (2017)
  • ...and 20 more