Pufferfish Privacy: An Information-Theoretic Study

Theshani Nuradha; Ziv Goldfeld

TL;DR

This work generalizes differential privacy through Pufferfish privacy and introduces an information-theoretic formulation, ε-MI PP, to quantify privacy with domain knowledge via conditional mutual information constraints. It develops a structured PP framework using private/public function pairs connected by a bipartite graph, proving that ε-MI PP lies between ε-PP and (ε,δ)-PP and establishing properties such as convexity, post-processing, and composability. The paper then designs noise mechanisms (Laplace, Gaussian) with variance- or covariance-based bounds that guarantee ε-MI PP, and proposes projection-based methods to handle high-dimensional queries, along with explicit dependence on private functions. Auditing tools based on sliced mutual information (SMI) enable practical privacy verification and PP auditing in high dimensions, complemented by applications to private mean estimation and algorithmic stability. Together, these results offer flexible privacy-utility tradeoffs that exploit distributional knowledge, along with scalable auditing and applicability to modern private inference and learning tasks.

Abstract

Pufferfish privacy (PP) is a generalization of differential privacy (DP) that offers flexibility in specifying sensitive information and integrates domain knowledge into the privacy definition. Inspired by the illuminating formulation of DP in terms of mutual information due to Cuff and Yu, this work explores PP through the lens of information theory. We provide an information-theoretic formulation of PP, termed mutual information PP (MI PP), in terms of the conditional mutual information between the mechanism and the secret, given the public information. We show that MI PP is implied by regular PP and characterize conditions under which the reverse implication also holds, recovering the relationship between DP and its information-theoretic variant as a special case. We establish convexity, composability, and post-processing properties for MI PP mechanisms and derive noise levels for the Gaussian and Laplace mechanisms. The resulting mechanisms are applicable under relaxed assumptions and provide improved noise levels in some regimes. Lastly, applications to auditing privacy frameworks, statistical inference tasks, and algorithm stability are explored.
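To make the abstract's description concrete, the block below writes out the mutual-information form of DP due to Cuff and Yu, followed by a sketch of what the corresponding conditional-mutual-information constraint for PP looks like. The first line is the standard $\epsilon$-MI DP condition. The second line is only a reading of the abstract combined with the framework tuple $(\mathcal{G},\mathcal{W},\mathcal{E},\Theta)$ that appears in Theorem 1: the symbols $M$, $g$, and $w$ (mechanism, private function, public function) are assumed names, and the exact quantifiers in the paper's Definition 5 may differ.

```latex
% epsilon-MI DP (Cuff and Yu): X = (X_1, ..., X_n) is the database,
% X^{-i} is the database with the i-th entry removed.
\sup_{i,\; P_X} \; I\bigl(X_i ;\, M(X) \,\big|\, X^{-i}\bigr) \;\le\; \epsilon

% Sketch of epsilon-MI PP (assumed form): the secret g(X) is protected
% given the public information w(X), over the allowed distributions Theta
% and the private/public pairs (g, w) in the edge set E.
\sup_{P_X \in \Theta,\; (g,w) \in \mathcal{E}} \; I\bigl(g(X) ;\, M(X) \,\big|\, w(X)\bigr) \;\le\; \epsilon
```

Read this way, MI DP corresponds to the special case in which the secret is a single record and the public information is the rest of the database, which is consistent with the abstract's remark that the DP relationship is recovered as a special case.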

Paper Structure (40 sections, 20 theorems, 99 equations, 2 figures, 1 algorithm)


Introduction
Pufferfish Privacy
Contributions
Related work
Organization
Background and Preliminaries
Notation
Differential Privacy
Pufferfish Privacy
Pufferfish Privacy and Mutual Information
Structured Pufferfish Privacy Framework
Information-Theoretic Formulation
Properties of $\epsilon$-MI PP
Mechanisms
Laplace Mechanism
...and 25 more sections

Key Result

Theorem 1

Consider the structured $(\epsilon,\delta)$-PP framework $(\mathcal{G},\mathcal{W},\mathcal{E},\Theta)$ from Definition 4. Let $\epsilon'>0$ be arbitrary and set $\epsilon''=\epsilon\wedge \frac{1}{2} \epsilon^2$. Then [...], and if $\Theta=\mathcal{P}(\mathcal{X}^{n \times k})$, then we further have [...]. Moreover, the inverse implication holds under either of the following conditions: [...]
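As a small worked illustration of the constant defined in Theorem 1, the block below evaluates $\epsilon \wedge \frac{1}{2}\epsilon^2$ at a few values. It assumes that $\wedge$ denotes the minimum, which is the usual convention but is not spelled out in this excerpt. The sample values of $\epsilon$ are chosen for illustration only.

```latex
% Assumption: a \wedge b = \min\{a, b\}.
\epsilon'' = \min\bigl\{\epsilon,\; \tfrac{1}{2}\epsilon^{2}\bigr\}

% The two arguments coincide at epsilon = 2:
\tfrac{1}{2}\epsilon^{2} \le \epsilon \iff \epsilon \le 2

% Sample values:
\epsilon = 0.5:\quad \epsilon'' = \min\{0.5,\; 0.125\} = 0.125
\epsilon = 2:\quad   \epsilon'' = \min\{2,\; 2\} = 2
\epsilon = 3:\quad   \epsilon'' = \min\{3,\; 4.5\} = 3
```

Under this reading, the quadratic term is the active one in the high-privacy regime $\epsilon < 2$, and the linear term takes over for larger $\epsilon$.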

Figures (2)

Figure 1: 2022-2023 salary data in four departments: HR, IT, PR, R&D. The goal is to publish the average 2023 salary in each department (the average of the blue cells) while hiding whether the number of raises (marked by red frames) is $\leq 2$, corresponding to $g(\cdot)=0$, or $>2$, corresponding to $g(\cdot)=1$. The average 2022 salaries (yellow cells) are public knowledge.
Figure 3: (a) Variance of the Laplace and Gaussian noise injected to achieve $\epsilon$-MI DP when $f$ is the column-wise average of a database in $\{0,1\}^{n \times d}$, with fixed $n=100$ and varying $d=1,2,5,10,30$. (b) The region where the MI DP Gaussian mechanism adds noise with smaller variance than the classical mechanism for achieving $(\epsilon',\sqrt{2 \epsilon})$-DP, with $d=30$.
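For orientation on the setting in Figure 3, the following Python sketch computes the per-coordinate noise variance of the classical Laplace and Gaussian mechanisms for the column-average query on $\{0,1\}^{n \times d}$. These are the textbook DP calibrations (Laplace scale $\Delta_1/\epsilon$, Gaussian $\sigma = \Delta_2\sqrt{2\ln(1.25/\delta)}/\epsilon$ for $\epsilon<1$), not the MI DP or MI PP noise levels derived in the paper, which are not reproduced in this summary. The function names and the values of `eps` and `delta` are hypothetical choices for illustration.

```python
import math


def laplace_variance(n: int, d: int, eps: float) -> float:
    """Per-coordinate variance of classical eps-DP Laplace noise for column means.

    Changing one row of a {0,1}^{n x d} database moves each of the d column
    means by at most 1/n, so the L1 sensitivity is d/n.
    """
    l1_sensitivity = d / n
    scale = l1_sensitivity / eps      # Laplace scale b = Delta_1 / eps
    return 2.0 * scale ** 2           # Var[Lap(b)] = 2 b^2


def gaussian_variance(n: int, d: int, eps: float, delta: float) -> float:
    """Per-coordinate variance of the classical (eps, delta)-DP Gaussian mechanism.

    Uses the textbook calibration, which is stated for 0 < eps < 1.
    The L2 sensitivity of the column-mean query is sqrt(d)/n.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("classical Gaussian calibration assumes 0 < eps < 1")
    l2_sensitivity = math.sqrt(d) / n
    sigma = l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    return sigma ** 2


if __name__ == "__main__":
    n, eps, delta = 100, 0.5, 1e-5    # illustrative values, not from the paper
    for d in (1, 2, 5, 10, 30):       # dimensions listed in the Figure 3 caption
        lap = laplace_variance(n, d, eps)
        gau = gaussian_variance(n, d, eps, delta)
        print(f"d={d:2d}  Laplace var={lap:.3e}  Gaussian var={gau:.3e}")
```

In this classical calibration the Laplace variance grows like $d^2$ while the Gaussian variance grows like $d$, which is one reason comparisons across mechanisms are reported as a function of the dimension.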

Theorems & Definitions (48)

Definition 1: Differential privacy
Definition 2: $\epsilon$-MI DP
Definition 3: Pufferfish privacy
Definition 4: Structured PP framework
Remark 1: Semantics of the structured PP framework
Remark 2: Special cases
Definition 5: $\epsilon$-MI PP
Remark 3: Revisiting semantics of the structured PP
Theorem 1: Relative strength
Remark 4: $\epsilon$-KL PP
...and 38 more
