Approximation of Pufferfish Privacy for Gaussian Priors

Ni Ding

TL;DR

This work addresses how to enforce $(\epsilon, \delta)$-pufferfish privacy when adversaries hold Gaussian priors over released statistics. It uses Monge's optimal transport plan to calibrate Laplace noise to the differences in both mean and variance across secret pairs, which gives a concrete privacy-utility trade-off for Gaussian priors and extends to Gaussian mixtures. The authors derive explicit sufficient conditions (bounds on the Laplace scale $b$) for single queries and for summation queries in multi-user settings, and illustrate the approach on two real datasets (Adult and Hungarian heart disease). They also discuss extensions to exponential and Gaussian mechanisms and outline refinements that could tighten the bounds and improve utility. Overall, the paper proposes a practical privacy design, within the pufferfish framework, for continuous priors with Gaussian or Gaussian-mixture structure.

Abstract

This paper studies how to approximate pufferfish privacy when the adversary's prior belief about the published data is Gaussian distributed. Using Monge's optimal transport plan, we show that $(\epsilon, \delta)$-pufferfish privacy is attained if the additive Laplace noise is calibrated to the differences in mean and variance of the Gaussian distributions conditioned on every discriminative secret pair. A typical application is the private release of the summation (or average) query, for which sufficient conditions are derived for approximating $\epsilon$-statistical indistinguishability of an individual's sensitive data. The result is then extended to arbitrary prior beliefs trained by Gaussian mixture models (GMMs): calibrating Laplace noise to a convex combination of differences in mean and variance between Gaussian components attains $(\epsilon, \delta)$-pufferfish privacy.
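
As a rough illustration of why both the mean gap and the variance gap enter the calibration, consider the scalar case and assume that the second parameter of each Gaussian is a standard deviation (an assumption about notation, since this summary does not define it). The monotone transport map between two one-dimensional Gaussians and the triangle-inequality bound on a Laplace density ratio are standard facts; the threshold $\tau$ mentioned afterwards is illustrative notation, and none of this is the paper's exact condition on $b$:

$$
\begin{aligned}
T(x) &= \mu_j + \frac{\sigma_j}{\sigma_i}\,(x - \mu_i), \\
x - T(x) &= (\mu_i - \mu_j) + (\sigma_i - \sigma_j)\,z, \qquad z = \frac{x - \mu_i}{\sigma_i} \sim \mathcal{G}(0, 1), \\
\frac{p_N(y - x)}{p_N(y - T(x))} &\le \exp\!\left(\frac{|x - T(x)|}{b}\right) \qquad \text{for } N \sim \mathcal{L}(b).
\end{aligned}
$$

When $\sigma_i = \sigma_j$ the displacement is the constant $\mu_i - \mu_j$, so the ratio is bounded uniformly by $\exp(|\mu_i - \mu_j| / b)$. When the deviations differ, the displacement grows with $|z|$, so a bound of the form $e^{\epsilon}$ can hold only on an event $|z| \le \tau$; the probability of the complementary Gaussian tail is one natural source of a nonzero $\delta$.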

Paper Structure (24 sections, 4 theorems, 33 equations, 5 figures)

Introduction
Our Contributions
Related Works
Notation
Organization
PRELIMINARIES
Privatization mechanism
Monge's Optimal Transport Plan $\hat{\pi}$
GAUSSIAN PRIORS
Special Case: $\ell_1$-sensitivity Method for Differential Privacy
Summation query in $K$-independent user system
GMM PRIORS
Experiment
DISCUSSION
Tighter Bound on \ref{eq:mainInEqAux}
...and 9 more sections

Key Result

Theorem 1

For $X|s_i \sim \mathcal{G}(\mu_i, \sigma_i)$ and $X|s_j \sim \mathcal{G}(\mu_j,\sigma_j)$ for all $(s_i,s_j) \in \mathbb{S}$, adding Laplace noise $N \sim \mathcal{L}(b)$, with the scale $b$ satisfying the theorem's sufficient condition, attains $(\epsilon, \delta)$-pufferfish privacy on $\mathbb{S}$ in $Y$.
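
A statement of this kind can be checked numerically for one secret pair. The sketch below is not the paper's calibration: it assumes the usual additive form of the definition, $P(Y \in B \mid s_i) \le e^{\epsilon} P(Y \in B \mid s_j) + \delta$, treats the second Gaussian parameter as a standard deviation, and uses hypothetical prior parameters. For a given $b$ and $\epsilon$ it computes the smallest $\delta$ that works for the pair, which is the integral of the positive part of $p_{Y|s_i} - e^{\epsilon} p_{Y|s_j}$ (taken in both directions).

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # sigma is treated as a standard deviation (an assumption about the notation).
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def laplace_pdf(x, b):
    # Zero-mean Laplace density with scale b.
    return np.exp(-np.abs(x) / b) / (2.0 * b)

def noisy_density(grid, mu, sigma, b):
    # Density of Y = X + N on a uniform grid that is symmetric about 0 and has
    # an odd number of points, so that mode="same" stays aligned with the grid.
    dx = grid[1] - grid[0]
    return np.convolve(gaussian_pdf(grid, mu, sigma), laplace_pdf(grid, b), mode="same") * dx

def smallest_delta(grid, prior_i, prior_j, b, eps):
    # Smallest delta for which the (eps, delta) inequality holds in both
    # directions for this one secret pair, evaluated by a Riemann sum.
    dx = grid[1] - grid[0]
    p_i = noisy_density(grid, *prior_i, b)
    p_j = noisy_density(grid, *prior_j, b)
    d_ij = np.sum(np.maximum(p_i - np.exp(eps) * p_j, 0.0)) * dx
    d_ji = np.sum(np.maximum(p_j - np.exp(eps) * p_i, 0.0)) * dx
    return max(d_ij, d_ji)

# Hypothetical priors (mu, sigma) for one secret pair; not taken from the paper.
PRIOR_I = (0.0, 1.0)
PRIOR_J = (1.0, 2.0)
# Wide grid so that truncating the Laplace tails has a negligible effect.
GRID = np.linspace(-50.0, 50.0, 5001)

if __name__ == "__main__":
    for b in (0.5, 1.0, 2.0, 4.0):
        print(f"b = {b}: smallest delta at eps = 1 is {smallest_delta(GRID, PRIOR_I, PRIOR_J, b, 1.0):.4f}")
```

Sweeping $b$ this way gives a numerical reference against which a sufficient condition on $b$ can be compared: a larger scale should never require a larger $\delta$, and a sufficient bound should sit at or above the smallest $b$ that meets a target $\delta$.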

Figures (5)

Figure 1: For the original data $X$ and $X'$ in (a), which are normally distributed with different means and variances, (b) shows the resulting probability densities of $Y = X + N$ and $Y' = X' + N$ for Laplace noise $N \sim \mathcal{L}(4)$, where the maximum logarithmic difference in probability density is $\max_{y} \left| \log \frac{P_{Y}(y)}{P_{Y'}(y)} \right| = 0.2992$. (c) shows the resulting probability densities of $Y$ and $Y'$ for Gaussian noise $N \sim \mathcal{G}(0,8)$, where $\max_{y} \left| \log \frac{P_{Y}(y)}{P_{Y'}(y)} \right| = 0.0156$. Note that the Laplace noise in (b) and the Gaussian noise in (c) have the same variance.
Figure 2: For the $K$-independent and identical user system, the lower bound on $b$ in \ref{eq:KIndLapC} for attaining $(1, 0.3)$-pufferfish privacy with Laplace noise $N \sim \mathcal{L}(b)$ as the number of users $K$ increases. We set the mean $\mu = 1$ and vary the variance $\sigma^2$ from $1$ to $25$.
Figure 3: The Adult dataset in the UCI Machine Learning Repository [UCI2007]: $X$ and $S$ denote the attributes education-num and race, respectively. To attain statistical indistinguishability between the secrets $s_i=$"race is Black" and $s_j=$"race is Asian-Pac-Islander", the privatized data $Y = X + N$ is generated, where the Laplace noise $N\sim\mathcal{L}(b)$ is calibrated by Theorem \ref{theo:LapGaussMix}, using the fitted GMM, to attain $(1,0.5)$-pufferfish privacy and $(1,0.3)$-pufferfish privacy.
Figure 4: The Hungarian heart disease dataset in the UCI Machine Learning Repository [UCI2007]: $X$ and $S$ denote the attributes chol (the cholesterol level) and sex, respectively. To attain statistical indistinguishability between the secrets $s_i=$"sex is female" and $s_j=$"sex is male", the privatized data $Y = X + N$ is generated, where the Laplace noise $N\sim\mathcal{L}(b)$ is calibrated by Theorem \ref{theo:LapGaussMix}, using the fitted GMM, to attain $(1,0.5)$-pufferfish privacy and $(1,0.3)$-pufferfish privacy.
Figure 5: The value of $\tau^*(\delta)$ in \ref{eq:TauStar1} when $\delta$ varies from $0.001$ to $0.999$.

Theorems & Definitions (6)

Definition 1: $(\epsilon,\delta)$-pufferfish privacy
Theorem 1
Remark 1: Translation priors
Corollary 1
Theorem 2
Corollary 2
