Formal Privacy Guarantees with Invariant Statistics

Young Hyun Cho, Jordan Awan

TL;DR

Semi-DP redefines adjacency over datasets that conform to a given invariant, so that indistinguishability holds between adjacent invariant-conforming datasets. The paper also develops customized mechanisms that satisfy Semi-DP.

Abstract

Motivated by the 2020 US Census products, this paper extends differential privacy (DP) to address the joint release of DP outputs and nonprivate statistics, referred to as invariants. Our framework, Semi-DP, redefines adjacency by focusing on datasets that conform to the given invariant, ensuring indistinguishability between adjacent datasets within the set of invariant-conforming datasets. We further develop customized mechanisms that satisfy Semi-DP, including the Gaussian mechanism and the optimal $K$-norm mechanism for rank-deficient sensitivity spaces. We apply the framework to contingency table analysis, which is relevant to the 2020 US Census, illustrating how Semi-DP enables the release of private outputs when the one-way margins are the invariant. Additionally, we provide a privacy analysis of the 2020 US Decennial Census using the Semi-DP framework, revealing that the effective privacy guarantees are weaker than advertised.

Paper Structure

This paper contains 37 sections, 28 theorems, 57 equations, 3 figures, 6 algorithms.

Key Result

Proposition 6

(Gaussian mechanism; Dong et al., 2022). Define the Gaussian mechanism that operates on a query $\phi$ as $M(X) = \phi(X) + \xi$, where $\xi \sim N(0, \sigma^{2} I_{d})$ for some $\sigma \geq \mu^{-1}\Delta_{2}(\phi;A)$. Then $M$ is $(\mathcal{D},A,G_{\mu})$-DP.
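As a rough illustration of Proposition 6, the sketch below adds independent Gaussian noise to each coordinate of a query value, using the smallest scale the proposition allows, $\sigma = \mu^{-1}\Delta_{2}(\phi;A)$, and treating $\sigma$ as the per-coordinate standard deviation. The function name `gaussian_mechanism` and its parameters are hypothetical and not from the paper. The sketch also assumes the caller has already computed the $\ell_2$ sensitivity $\Delta_{2}(\phi;A)$ over adjacent invariant-conforming datasets; it does not compute that quantity.

```python
import random


def gaussian_mechanism(phi_x, l2_sensitivity, mu, rng=None):
    """Return phi(X) + xi with xi ~ N(0, sigma^2 I_d), sigma = l2_sensitivity / mu.

    phi_x          : list of floats, the exact query value phi(X)
    l2_sensitivity : assumed to be Delta_2(phi; A), supplied by the caller
    mu             : Gaussian-DP parameter, must be positive
    rng            : optional random.Random instance for reproducibility
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if l2_sensitivity < 0:
        raise ValueError("l2_sensitivity must be non-negative")
    rng = rng or random.Random()
    # Smallest noise scale permitted by the proposition.
    sigma = l2_sensitivity / mu
    return [value + rng.gauss(0.0, sigma) for value in phi_x]


# Example: a 3-dimensional query with assumed sensitivity 2 and mu = 1.
noisy = gaussian_mechanism([10.0, 4.0, 7.0], l2_sensitivity=2.0, mu=1.0,
                           rng=random.Random(0))
print(noisy)
```

Any larger $\sigma$ also satisfies the condition in the proposition; the sketch picks the boundary value only because it adds the least noise.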

Figures (3)

  • Figure 1: A $7 \times 6$ contingency table illustrating the sensitivity space element $\mathbf{v}_{ijlk}$ for the indices $i = 2$, $j = 2$, $k = 5$, and $l = 5$. The table shows the placement of $+1$ values at positions $(i, j) = (2, 2)$ and $(l, k) = (5, 5)$, as well as $-1$ values at positions $(i, k) = (2, 5)$ and $(l, j) = (5, 2)$, while all other cells contain zeros. It represents a typical element of the sensitivity space $S_{Semi}$ that preserves the one-way margins.
  • Figure 2: Comparison of average $L_2$-costs between our mechanism and the naive mechanism across varying contingency table sizes $k \in \{2,3,\dots,10\}$. Model I assumes uniform cell probabilities, while Model II has linearly increasing cell probabilities.
  • Figure 3: Comparison of $L_2$-costs for the $K$-norm mechanism against naive $\ell_{1}$, $\ell_{2}$, and $\ell_{\infty}$-norm mechanisms across two models for contingency tables of size $k = 2$ and $k = 3$. The results are shown for three privacy parameters $\epsilon = 0.1, 0.5,$ and $1$.
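The sensitivity space element described in the Figure 1 caption can be checked with a small sketch. It builds $\mathbf{v}_{ijlk}$ for a $7 \times 6$ table with $+1$ at $(i, j)$ and $(l, k)$ and $-1$ at $(i, k)$ and $(l, j)$, then confirms that every row and column sum is zero, so adding it to a table leaves the one-way margins unchanged. The helper names `sensitivity_element` and `margins` are hypothetical, and the requirement that $i \neq l$ and $j \neq k$ is an assumption made here to exclude the degenerate all-zero case.

```python
def sensitivity_element(n_rows, n_cols, i, j, l, k):
    """Build v_{ijlk} as a nested list, using 1-based indices as in Figure 1."""
    if not (1 <= i <= n_rows and 1 <= l <= n_rows):
        raise ValueError("row indices out of range")
    if not (1 <= j <= n_cols and 1 <= k <= n_cols):
        raise ValueError("column indices out of range")
    # Assumption of this sketch: distinct rows and columns, otherwise the
    # +1 and -1 entries cancel and the element is all zeros.
    if i == l or j == k:
        raise ValueError("need i != l and j != k")
    v = [[0] * n_cols for _ in range(n_rows)]
    v[i - 1][j - 1] += 1   # +1 at (i, j)
    v[l - 1][k - 1] += 1   # +1 at (l, k)
    v[i - 1][k - 1] -= 1   # -1 at (i, k)
    v[l - 1][j - 1] -= 1   # -1 at (l, j)
    return v


def margins(table):
    """Return (row sums, column sums) of a nested-list table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    return row_sums, col_sums


# The element from Figure 1: i = 2, j = 2, k = 5, l = 5 in a 7 x 6 table.
v = sensitivity_element(7, 6, i=2, j=2, l=5, k=5)
row_sums, col_sums = margins(v)
print(row_sums)  # all zeros: row margins are preserved
print(col_sums)  # all zeros: column margins are preserved
```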

Theorems & Definitions (55)

  • Definition 1
  • Example 1
  • Remark 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Proposition 6
  • Proposition 7
  • Definition 8
  • Proposition 9
  • ...and 45 more