
A Polynomial Time, Pure Differentially Private Estimator for Binary Product Distributions

Vikrant Singhal

TL;DR

The paper resolves a long-standing open problem by presenting a polynomial-time, ε-differentially private algorithm that estimates the means of binary product distributions over {0,1}^d accurately in total-variation (TV) distance with near-optimal sample complexity. The method uses private partitioning to isolate heavy coordinates, rescaling, and a private sub-Gaussian mean learner to obtain direction-wise accuracy, followed by a final one-shot phase for the lighter coordinates. Its sample complexity is d/α^2 + d/(εα) up to polylogarithmic factors, which matches known lower bounds under pure DP, and it runs in polynomial time, whereas prior pure-DP methods with optimal sample complexity required exponential time. The work highlights private preconditioning as a key technique for direction-wise DP estimation and suggests applicability to other distribution families and to metrics beyond TV distance.
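To make the problem setting concrete, the sketch below shows a naive ε-DP baseline: it adds Laplace noise to the coordinate-wise empirical means. This is not the paper's algorithm (there is no partitioning, rescaling, or sub-Gaussian learner here), and it does not come with the TV-distance guarantee described above. The function name `naive_dp_product_mean` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def naive_dp_product_mean(samples, epsilon, rng=None):
    """Coordinate-wise Laplace baseline (an illustration, NOT the paper's method).

    samples: array of shape (n, d) with entries in {0, 1}.
    epsilon: pure-DP privacy parameter.
    """
    rng = np.random.default_rng() if rng is None else rng
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape

    empirical_mean = samples.mean(axis=0)

    # Replacing one sample moves each coordinate's mean by at most 1/n,
    # so the L1 sensitivity of the whole mean vector is at most d/n.
    scale = d / (n * epsilon)
    noisy_mean = empirical_mean + rng.laplace(loc=0.0, scale=scale, size=d)

    # Clipping is post-processing, so it does not affect the privacy guarantee.
    return np.clip(noisy_mean, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_means = np.array([0.1, 0.5, 0.9])
    data = (rng.random((20000, 3)) < true_means).astype(int)
    print(naive_dp_product_mean(data, epsilon=1.0, rng=rng))
```

The baseline only controls additive error per coordinate. Coordinates with means very close to 0 or 1 need finer, relative accuracy for a good TV-distance bound, which is the difficulty that the heavy/light coordinate treatment in the summary above is meant to address.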

Abstract

We present the first $\varepsilon$-differentially private, computationally efficient algorithm that estimates the means of product distributions over $\{0,1\}^d$ accurately in total-variation distance, whilst attaining the optimal sample complexity to within polylogarithmic factors. Prior work either solved this problem efficiently and optimally under weaker notions of privacy, or solved it optimally but with exponential running time.

Paper Structure

This paper contains 22 sections, 17 theorems, 37 equations, and 1 algorithm.

Key Result

Theorem 1.1

For every $\varepsilon,\alpha,\beta > 0$, there exists a polynomial-time, $\varepsilon$-DP algorithm that takes $n$ i.i.d. samples from a product distribution $P$ over $\{0,1\}^d$ and returns a product distribution $Q$ such that if $n \geq \widetilde{O}\left(\frac{d}{\alpha^2} + \frac{d}{\varepsilon\alpha}\right)$, where $\widetilde{O}(\cdot)$ hides all polylogarithmic factors, then with probability at least $1-\beta$, the total-variation distance between $P$ and $Q$ is at most $\alpha$.
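The accuracy guarantee in the theorem is stated in total-variation distance between two product distributions. As a minimal sketch of what that quantity is, the code below computes it exactly by enumerating all $2^d$ points of $\{0,1\}^d$, so it is only usable for small $d$. The function names `product_pmf` and `tv_distance_product` are hypothetical and do not come from the paper.

```python
import itertools

import numpy as np


def product_pmf(p, x):
    """Probability of the binary vector x under the product distribution with means p."""
    p = np.asarray(p, dtype=float)
    x = np.asarray(x)
    return float(np.prod(np.where(x == 1, p, 1.0 - p)))


def tv_distance_product(p, q):
    """Exact TV distance between two product distributions over {0,1}^d.

    Brute force over all 2^d outcomes; intended only for small d.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same dimension")
    total = 0.0
    for x in itertools.product((0, 1), repeat=len(p)):
        total += abs(product_pmf(p, x) - product_pmf(q, x))
    return 0.5 * total


if __name__ == "__main__":
    # Single coordinate: TV between Bernoulli(0.5) and Bernoulli(0.75) is 0.25.
    print(tv_distance_product([0.5], [0.75]))
```

A helper like this can be used to check, on small instances, whether an estimated mean vector is within a target $\alpha$ of the true one in the sense of the theorem.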

Theorems & Definitions (46)

  • Theorem 1.1: Informal
  • Remark 1.2
  • Remark 1.3
  • Definition 2.1
  • Lemma 2.2: $\chi^2$-Divergence between Bernoulli Distributions
  • proof
  • Lemma 2.3: Sub-Additivity under Product Distributions
  • Lemma 2.4: Pinsker's Inequality
  • Lemma 2.5: Bernstein's Inequality
  • Lemma 2.6: Multiplicative Chernoff Bound
  • ...and 36 more