Profile Reconstruction from Private Sketches

Hao Wu, Rasmus Pagh

TL;DR

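Profile reconstruction from a histogram privatized with discrete Laplace noise has error whose dependence on $d$ is $O(1/\sqrt{d})$. An information-theoretic lower bound shows that this dependence on $d$ is asymptotically optimal among all private, updatable sketches for the profile reconstruction problem with a high-probability error guarantee.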

Abstract

Given a multiset of $n$ items from $\mathcal{D}$, the \emph{profile reconstruction} problem is to estimate, for $t = 0, 1, \dots, n$, the fraction $\vec{f}[t]$ of items in $\mathcal{D}$ that appear exactly $t$ times. We consider differentially private profile estimation in a distributed, space-constrained setting where we wish to maintain an updatable, private sketch of the multiset that allows us to compute an approximation of $\vec{f} = (\vec{f}[0], \dots, \vec{f}[n])$. Starting from a histogram privatized with discrete Laplace noise, we show how to ``reverse'' the noise using an approach of Dwork et al.~(ITCS '10). We show how to speed up their LP-based technique from polynomial time to $O(d + n \log n)$, where $d = |\mathcal{D}|$, and analyze the achievable error in the $\ell_1$, $\ell_2$, and $\ell_\infty$ norms. In all cases the dependency of the error on $d$ is $O(1 / \sqrt{d})$. We give an information-theoretic lower bound showing that this dependence on $d$ is asymptotically optimal among all private, updatable sketches for the profile reconstruction problem with a high-probability error guarantee.
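
To make the objects in the abstract concrete, the following is a minimal Python sketch of the exact profile $\vec{f}$ and of a histogram privatized with discrete Laplace noise. It is an illustration only, not the paper's algorithm: the function names `profile`, `discrete_laplace`, and `privatize` are hypothetical, and the noise parameter assumes that neighboring multisets differ in a single item, so that each count changes by at most 1. The noise-reversal step is not shown.

```python
import math
import random
from collections import Counter


def profile(items, domain, n):
    """Return [f[0], ..., f[n]]: f[t] is the fraction of domain
    elements that appear exactly t times in the multiset `items`."""
    counts = Counter(items)
    f = [0.0] * (n + 1)
    for x in domain:
        f[counts[x]] += 1.0 / len(domain)
    return f


def discrete_laplace(eps, rng):
    """Sample Z with P[Z = z] proportional to exp(-eps * |z|),
    drawn as the difference of two i.i.d. geometric variables."""
    q = math.exp(-eps)

    def geometric():
        # P[G >= k] = q**k for k = 0, 1, 2, ...
        u = 1.0 - rng.random()  # uniform in (0, 1]
        return int(math.floor(math.log(u) / math.log(q)))

    return geometric() - geometric()


def privatize(hist, domain, eps, rng):
    """Add independent discrete Laplace noise to every domain entry.
    Assumption: sensitivity 1 (neighbors differ in one item)."""
    return {x: hist.get(x, 0) + discrete_laplace(eps, rng) for x in domain}


if __name__ == "__main__":
    domain = ["a", "b", "c"]
    items = ["a", "a", "b"]
    print(profile(items, domain, n=len(items)))
    print(privatize(Counter(items), domain, eps=1.0, rng=random.Random(0)))
```

Noise is added to every element of the domain, including those with count zero, so the privatized histogram always has $d$ entries and may contain negative values.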

Paper Structure

This paper contains 28 sections, 15 theorems, 104 equations, and 5 algorithms.

Key Result

Theorem 1.1

Let $\eta \in (0, 1)$ and $\varepsilon > 0$. Denote $B \doteq \frac{1}{\varepsilon} \ln \left( \max \left\{ \frac{2d}{\eta (e^\varepsilon + 1)}, \frac{8 \, e^\varepsilon}{e^{2\varepsilon} - 1} \right\} \right)$, and assume that $n \ge B$. There is an algorithm that, given a private version $\tilde{h}$ of a $d$ [...] where $\tilde{f}[t] = \frac{1}{d} \left| \{ \ell \in \mathcal{D} : \tilde{h}[\ell] = t \} \right|$, [...]
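
The two quantities that survive in the truncated statement above, the threshold $B$ and the empirical profile $\tilde{f}$ of the private histogram $\tilde{h}$, can be computed directly from their formulas. The sketch below is a hedged illustration: the function names `threshold_B` and `noisy_profile` are hypothetical, and because the rest of the theorem is not available here, the range of $t$ is not restricted and $\tilde{f}$ is returned as a dictionary keyed by the values that actually occur in $\tilde{h}$.

```python
import math
from collections import Counter


def threshold_B(d, eps, eta):
    """B = (1/eps) * ln(max{2d / (eta (e^eps + 1)),
                             8 e^eps / (e^(2 eps) - 1)})."""
    first = 2 * d / (eta * (math.exp(eps) + 1))
    second = 8 * math.exp(eps) / (math.exp(2 * eps) - 1)
    return math.log(max(first, second)) / eps


def noisy_profile(h_tilde, d):
    """tilde_f[t] = (1/d) * |{l : h_tilde[l] = t}| for each value t
    that occurs in the private histogram (given as d integers)."""
    counts = Counter(h_tilde)
    return {t: c / d for t, c in counts.items()}


if __name__ == "__main__":
    print(threshold_B(d=1000, eps=1.0, eta=0.1))
    print(noisy_profile([0, 2, 2, -1], d=4))
```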

Theorems & Definitions (36)

  • Theorem 1.1
  • Theorem 1.2
  • Definition 2.1: $\varepsilon$-Differential Privacy [DR14]
  • Definition 3.1: Circulant Matrix
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Theorem 4.5
  • Lemma 4.6
  • ...and 26 more