Better Gaussian Mechanism using Correlated Noise

Christian Janos Lebeda

TL;DR

This work introduces a simple variant of the Gaussian mechanism for privately releasing $d$ counting queries under the add/remove neighboring relation. The mechanism adds one shared Gaussian noise sample to all counts in addition to independent per-coordinate noise. This reduces the standard deviation of the noise on each query from $\sqrt{d}/\mu$ to $(\sqrt{d}+1)/(2\mu)$ while preserving $\mu$-GDP, for a total per-query error variance of $\frac{d+2\sqrt{d}+1}{4\mu^2}$. The paper gives several representations of the mechanism, including an injective lift to $\mathbb{R}^{d+1}$ and an equivalent formulation as post-processing of the standard Gaussian mechanism applied to a transformed dataset. It also shows that the mechanism is near-optimal relative to specialized ellipse-based approaches for large $d$ and high sparsity. Further results estimate the dataset size more accurately, analyze a bounded-density Count setting, and generalize to grouped-query scenarios, which suggests the technique applies beyond counting queries. The mechanism is simple to implement and improves utility for DP counting tasks.

Abstract

We present a simple variant of the Gaussian mechanism for answering differentially private queries when the sensitivity space has a certain common structure. Our motivating problem is the fundamental task of answering $d$ counting queries under the add/remove neighboring relation. The standard Gaussian mechanism solves this task by adding noise distributed as a Gaussian with variance scaled by $d$ independently to each count. We show that adding a random variable distributed as a Gaussian with variance scaled by $(\sqrt{d} + 1)/4$ to all counts allows us to reduce the variance of the independent Gaussian noise samples to scale only with $(d + \sqrt{d})/4$. The total noise added to each counting query follows a Gaussian distribution with standard deviation scaled by $(\sqrt{d} + 1)/2$ rather than $\sqrt{d}$. The central idea of our mechanism is simple and the technique is flexible. We show that applying our technique to another problem gives similar improvements over the standard Gaussian mechanism.
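
As a minimal sketch of the mechanism described in the abstract, the Python snippet below draws one shared Gaussian sample and $d$ independent samples with the stated variances, each divided by $\mu^2$ (this scaling follows the per-query standard deviation $(\sqrt{d}+1)/(2\mu)$ given in the TL;DR). The function names `noise_variances` and `correlated_gaussian_mechanism` are illustrative and not taken from the paper. The sketch assumes NumPy and that `counts` already holds the exact answers to the $d$ counting queries.

```python
import numpy as np


def noise_variances(d, mu):
    """Return (shared_var, independent_var) for d counting queries under mu-GDP."""
    sqrt_d = np.sqrt(d)
    shared_var = (sqrt_d + 1.0) / (4.0 * mu**2)       # noise added to all counts
    independent_var = (d + sqrt_d) / (4.0 * mu**2)    # per-count noise
    return shared_var, independent_var


def correlated_gaussian_mechanism(counts, mu, rng=None):
    """Add one shared Gaussian sample plus independent Gaussian noise to each count."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    d = counts.shape[0]
    shared_var, independent_var = noise_variances(d, mu)
    shared = rng.normal(0.0, np.sqrt(shared_var))     # same value for every query
    independent = rng.normal(0.0, np.sqrt(independent_var), size=d)
    return counts + shared + independent
```

For $d = 100$ and $\mu = 1$ the two variances are $2.75$ and $27.5$, which sum to $30.25 = (11/2)^2$. The standard Gaussian mechanism would use a per-query variance of $100$ in the same setting.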

Paper Structure

This paper contains 8 sections, 14 theorems, 7 equations, 5 figures.

Key Result

Lemma 2.1

Let $f \colon \mathcal{U}^\mathbb{N} \rightarrow \mathbb{R}^d$ be a function with $\ell_2$ sensitivity $\Delta f$. Then the mechanism that outputs $f(X) + Z$ where $Z \sim \mathcal{N}(0, (\Delta f)^2/\mu^2 I_d)$ satisfies $\mu$-GDP.
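
The lemma can be illustrated with a short Python sketch, assuming NumPy. The function name `gaussian_mechanism` and the example counts are hypothetical. The example takes the $\ell_2$ sensitivity of $d$ counting queries under add/remove to be $\sqrt{d}$, which matches the $\sqrt{d}/\mu$ noise scale quoted for the standard Gaussian mechanism.

```python
import numpy as np


def gaussian_mechanism(value, l2_sensitivity, mu, rng=None):
    """Release value + Z with Z ~ N(0, (l2_sensitivity / mu)^2 * I_d)."""
    rng = np.random.default_rng() if rng is None else rng
    value = np.asarray(value, dtype=float)
    sigma = l2_sensitivity / mu  # per-coordinate standard deviation
    return value + rng.normal(0.0, sigma, size=value.shape)


# Assumed setting: d counting queries, where adding or removing one data point
# changes each count by at most 1, so the l2 sensitivity is sqrt(d).
d = 4
noisy = gaussian_mechanism(
    np.array([3, 0, 7, 1]),
    l2_sensitivity=np.sqrt(d),
    mu=1.0,
    rng=np.random.default_rng(0),
)
```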

Figures (5)

  • Figure 1: Gaussian Mechanism using Correlated Noise
  • Figure 2: 2D geometric intuition of our technique. Each data point is in $[0,1] \times [0,1]$. The red cross marks $f(X)$ in all figures. The two left figures depict the shape of noise for the standard Gaussian mechanism. Informally, the mechanism is private if $f(X')$ is always inside the blue circle with radius $\sqrt{2}$. The value of $f(X')$ under the replacement neighboring relation can be anywhere in the grey box in the first figure. That is, the box shows the result of adding any point from the sensitivity space to $f(X)$. The boxes in the second figure similarly depict the sensitivity space under the add/remove neighboring relation. In general, there are $2^d$ possible values for $f(X')$ that touch the hypersphere under replacement, but only $2$ such values under add/remove for any $d$. In the third figure we focus on the case where we add a data point to $X$ to construct $X'$. Notice that the orange circle with radius $\sqrt{2}/2$ centered at $f(X) + (0.5, 0.5)$ contains the box. We spend part of the privacy budget to add noise along the yellow diagonal. The resulting noise is elliptical and contains the sensitivity space under add/remove, as seen in the last figure.
  • Figure 3: The plot shows the effect of changing the parameter $C$ for datasets with $d = 10000$. On the right, we show a covariance matrix similar to the one in the paper's covariance-matrix corollary. All entries in the matrix should be scaled by $1/\mu^2$. For simplicity we consider $\mu = 1$. We have that $A = d/C^2 + 1 = 10^4/C^2 + 1$ and $B = d + C^2 = 10^4 + C^2$. The plot depicts the value of $(A + B)/4$ as a function of $C$.
  • Figure 4: Correlated Gaussian Mechanism for disjoint queries
  • Figure 5: Comparison of error for the bounded Count problem for $d=1000$. Note that the optimal Count ellipse by [constructions-k-norm-elliptic] requires $k \leq d/2$. The left plot corresponds to the right plot in [constructions-k-norm-elliptic], where we included our mechanism for $k > d/2$. The line for the $\ell_2$ Ball represents the standard Gaussian mechanism in this setting.
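
The quantities named in the Figure 3 caption can be evaluated directly. The sketch below, assuming NumPy and $\mu = 1$, computes $A$, $B$, and $(A + B)/4$ over a grid of $C$ values for $d = 10000$. The helper name `figure3_quantities` and the grid are illustrative choices, not taken from the paper. By elementary calculus, $d/C^2 + C^2$ is minimized at $C = d^{1/4}$, so the grid minimum should fall at $C = 10$.

```python
import numpy as np


def figure3_quantities(d, C):
    """Return (A, B, (A + B) / 4) for mu = 1, using the Figure 3 caption formulas."""
    A = d / C**2 + 1.0
    B = d + C**2
    return A, B, (A + B) / 4.0


d = 10_000
C_grid = np.linspace(1.0, 100.0, 9_901)      # step 0.01, so the grid includes C = 10
values = figure3_quantities(d, C_grid)[2]    # (A + B) / 4 at every grid point
best_C = C_grid[np.argmin(values)]           # grid point with the smallest value
```

At $C = 10$ this gives $A = 101$ and $B = 10100$, so $(A + B)/4 = 2550.25 = (\sqrt{d}+1)^2/4$, which agrees with the per-query variance stated in the TL;DR for $\mu = 1$.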

Theorems & Definitions (19)

  • Definition 2.1: Neighboring datasets (denoted $X \sim X'$)
  • Definition 2.2: Gaussian differential privacy [gaussianDPand-f-DP]
  • Definition 2.3: Trade-off function
  • Definition 2.4: $\ell_2$ sensitivity
  • Definition 2.5: Sensitivity space
  • Lemma 2.1: The Gaussian mechanism [gaussianDPand-f-DP]
  • Lemma 2.2: Post-processing [gaussianDPand-f-DP]
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • ...and 9 more