Optimal Pure Differentially Private Sparse Histograms in Deterministic Linear Time

Florian Kerschbaum, Steven Lee, Hao Wu

TL;DR

This work tackles the problem of releasing a pure differentially private sparse histogram when the domain size $d$ far exceeds the number of participants $n$. It introduces a deterministic linear-time algorithm in the Word-RAM model, achieving the optimal $\ell_\infty$ error and supporting efficient circuit/MPC implementations via a novel private item blanket with target-length padding. The approach hinges on a time-oblivious, purified noise sampler for the discrete Laplace distribution and a careful padding scheme that preserves privacy while keeping the output sparse. By bridging central and distributed DP utilities, the paper closes the utility gap for histograms under pure DP and provides a practical path to secure MPC deployments with near-linear computation and communication costs.

Abstract

We present an algorithm that releases a pure differentially private (under the replacement neighboring relation) sparse histogram for $n$ participants over a domain of size $d \gg n$. Our method achieves the optimal $\ell_\infty$-estimation error and runs in strictly $O(n)$ time in the Word-RAM model, improving upon the previous best deterministic-time bound of $\tilde{O}(n^2)$ and resolving the open problem of breaking this quadratic barrier (Balcer and Vadhan, 2019). Moreover, the algorithm admits an efficient circuit implementation, enabling the first pure DP histogram MPC protocol that combines near-linear communication and computation cost with optimal $\ell_\infty$-estimation error. Central to our algorithm is a novel **private item blanket** technique with target-length padding, which hides differences in the supports of neighboring histograms while remaining efficiently implementable.
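To make the problem setting concrete, the following is a minimal Python sketch of the textbook dense baseline, not the paper's algorithm. It adds independent discrete Laplace noise to every one of the $d$ counts, so it takes $\Theta(d)$ time and produces a dense output, which is exactly what a sparse method aims to avoid when $d \gg n$. The function names, the 0-indexed domain, and the floating-point inversion sampler are assumptions made for illustration. In particular, floating-point sampling is neither time-oblivious nor strictly pure DP, so the sketch only conveys the shape of the mechanism.

```python
import math
import random
from collections import Counter


def sample_discrete_laplace(t, rng):
    """Sample an integer k with P(k) proportional to exp(-t * |k|).

    Uses the standard fact that the difference of two i.i.d. geometric
    variables is discrete Laplace. Illustrative only: it relies on
    floating-point arithmetic.
    """
    log_alpha = -t  # alpha = exp(-t), so log(alpha) = -t

    def geometric():
        # Inversion: floor(log(U) / log(alpha)) with U uniform on (0, 1]
        # satisfies P(G >= k) = alpha ** k.
        u = 1.0 - rng.random()
        return int(math.floor(math.log(u) / log_alpha))

    return geometric() - geometric()


def dense_private_histogram(data, d, eps, rng=None):
    """Hypothetical dense baseline over the domain {0, ..., d - 1}.

    Replacing one element changes two counts by 1 each (L1 sensitivity 2),
    so each count receives noise with P(k) proportional to exp(-eps * |k| / 2).
    """
    rng = rng or random.Random()
    counts = Counter(data)
    return [counts.get(i, 0) + sample_discrete_laplace(eps / 2, rng)
            for i in range(d)]
```

For example, `dense_private_histogram([0, 0, 3], d=5, eps=1.0)` returns a list of five noisy integer counts. The loop over all `d` coordinates is the cost the abstract's $O(n)$ bound avoids.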

Paper Structure

This paper contains 36 sections, 23 theorems, 45 equations, 3 tables, 8 algorithms.

Key Result

Theorem 1.1

Let $n, d \in \mathbb{N}_+$ and $\varepsilon \in \mathbb{Q}_+$, with representations that fit in a constant number of machine words. Given a dataset $\mathcal{X}$ of $n$ user-contributed elements from domain $[d]$, there exists an $\varepsilon$-differentially private algorithm that runs deterministi

Theorems & Definitions (34)

  • Theorem 1.1: Private Sparse Histogram, Informal version of \ref{thm:private-sparse-histogram-formal}
  • Theorem 1.2: Private Sparse Histogram Circuit, Informal version of \ref{thm:purified-approximate-discrete-laplace-sampler-formal}
  • Theorem 1.3: Deterministic Approximate Sampler, Informal version of \ref{thm:time-oblivious-distribution-sampler}
  • Theorem 1.4: Relaxed Discrete Laplace, Informal version of \ref{thm:purified-approximate-discrete-laplace-sampler}
  • Definition 2.1: $(\varepsilon, \delta)$-Indistinguishability
  • Definition 2.2: $(\varepsilon, \delta)$-Private Algorithm (DR14)
  • Definition 2.3: $(\alpha, \beta)$-Simultaneous Accurate Estimator
  • Definition 3.1: Geometric Distribution
  • Definition 3.2: Discrete Laplace Distribution
  • Proposition 3.1 (BalcerV19)
  • ...and 24 more