Differentially Private Range Queries with Correlated Input Perturbation

Prathamesh Dharangutte, Jie Gao, Ruobin Gong, Guanyang Wang

Abstract

This work proposes a class of differentially private mechanisms for linear queries, in particular range queries, that leverages correlated input perturbation to simultaneously achieve unbiasedness, consistency, statistical transparency, and control over utility requirements, with accuracy targets expressed either for certain query margins or as implied by the hierarchical database structure. The proposed Cascade Sampling algorithm instantiates the mechanism exactly and efficiently. Our theoretical and empirical analysis demonstrates that the mechanism achieves near-optimal utility, competes effectively with other methods, and retains all the favorable statistical properties listed above.

Paper Structure (32 sections, 16 theorems, 57 equations, 8 figures)

Key Result

Theorem 3.3

Consider $n = 2^k$ data points identified by their binary representation as described earlier. Assuming the noise mechanism is defined as per Definition def:correlated-noise-mechanism, every node in the binary tree, including both leaf and internal nodes, has a marginal distribution of $\mathbb{N}(0

Figures (8)

  • Figure 1: Left: Illustration of noise allocation to sibling nodes and their parent within a binary tree. Here $X_\star$ denotes the noise applied to a node labeled $\star$, and $Y_\star$ is another standard Gaussian independent of $X_\star$. The noise values for the children nodes are $X_{\star 0} := X_{\star}/2 + \sqrt{3} Y_\star/2$ and $X_{\star 1} := X_{\star}/2 - \sqrt{3} Y_\star/2$. Right: Noise allocation on the top two levels of a binary tree.
  • Figure 2: (a): Comparison of Gaussian Generation Speeds (Left: original scale; Right: log-log scale). (b) & (c): $\mathsf{err}_{\boldsymbol{W},2}({\mathcal{W}}_\sigma)$ and $\mathsf{err}_{\boldsymbol{W},\infty}({\mathcal{W}}_\sigma)$ for continuous queries in log-log scale. (d) & (e): Variance of queries for levels of binary tree in log-log scale.
  • Figure 3: Cascade Sampling Algorithm for Streaming Data
  • Figure 4: Noise Allocation Mechanism for a General Binary Tree
  • Figure 5: Noise Allocation Mechanism for two-dimensional range queries
  • ...and 3 more figures
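
The sibling noise allocation in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration of that recursion only, not the paper's Cascade Sampling implementation: the function name `sample_tree_noise`, the level-by-level list layout, and the use of unit-variance Gaussians at the root are assumptions made for the example. With $X_\star$ and $Y_\star$ independent standard Gaussians, each child has variance $1/4 + 3/4 = 1$, and the two children sum to the parent's noise.

```python
import math
import random


def sample_tree_noise(k, rng=None):
    """Sample correlated noise for a complete binary tree of depth k.

    Hypothetical helper for illustration. Returns a list of levels, where
    levels[d] holds the 2**d noise values at depth d (root at depth 0).
    """
    rng = rng or random.Random()
    # Root noise X: a standard Gaussian (unit variance is an assumption here).
    levels = [[rng.gauss(0.0, 1.0)]]
    for _ in range(k):
        children = []
        for x in levels[-1]:
            # Y is a fresh standard Gaussian, independent of the parent noise X.
            y = rng.gauss(0.0, 1.0)
            spread = math.sqrt(3.0) * y / 2.0
            children.append(x / 2.0 + spread)  # X_{*0} = X/2 + sqrt(3) Y / 2
            children.append(x / 2.0 - spread)  # X_{*1} = X/2 - sqrt(3) Y / 2
        levels.append(children)
    return levels
```

Calling `sample_tree_noise(3, random.Random(0))` yields four levels with 1, 2, 4, and 8 values. Because each pair of siblings sums to its parent by construction, the noise on any internal node equals the total noise on the leaves beneath it.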

Theorems & Definitions (41)

  • Definition 2.1
  • Definition 2.2: gong2022transparent, Def. 3
  • Definition 2.3
  • Remark 2.4
  • Definition 3.1
  • Definition 3.2
  • Theorem 3.3
  • Proposition 3.4
  • Proposition 3.5
  • Remark 3.6
  • ...and 31 more