PREM: Privately Answering Statistical Queries with Relative Error

Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi, Sushant Sachdeva

TL;DR

PREM tackles privately answering statistical queries with a relative-error guarantee under DP by introducing a Private Relative Error MWU framework that outputs a synthetic histogram. The core idea combines RangeMonitor, a private threshold-tracking primitive, with a multiplicative-weights update to iteratively build a synthetic dataset that approximates counts up to a multiplicative factor $1\pm\zeta$ and a polylogarithmic additive error $\alpha$. For $(\varepsilon,\delta)$-DP, PREM achieves $\alpha = \tilde{O}\left( \frac{1}{\zeta\varepsilon} \left( \log n \log \frac{1}{\delta} \right)^{\frac{3}{2}} \sqrt{\log |X|} \log \left( \frac{|F|}{\beta} \right) \right)$, while for pure-DP it attains $\alpha = \tilde{O}\left( \sqrt{ \frac{n \log^3 n}{\zeta^2 \varepsilon} \log |X| \log |F| } + \frac{\log n}{\varepsilon} \log \frac{1}{\beta} \right)$. The work also derives near-matching lower bounds for approximate-DP and extends the framework to real-valued queries via a thresholding reduction, outlining open questions about tightening pure-DP gaps and specializing to particular query families. Overall, PREM demonstrates that relative-error privacy-preserving data synthesis can achieve polylogarithmic additive error in key parameters, offering a substantial improvement over additive-only DP mechanisms in many regimes.
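
The multiplicative-weights idea mentioned above can be illustrated with a minimal, non-private sketch. This is not the PREM algorithm: it omits the noise, the RangeMonitor primitive, and the relative-error thresholds, and only shows how a single generic MWU step moves a synthetic histogram toward a true count. The function name `mwu_step`, the learning rate `eta`, and the toy domain are all assumptions made for illustration.

```python
import math

def mwu_step(weights, query, true_count, n, eta=0.5):
    """One generic, NON-private multiplicative-weights step (illustrative only).

    weights:    dict mapping each domain element to its probability mass
    query:      function x -> 0 or 1 (a counting query)
    true_count: value of the query on the real dataset
    n:          dataset size, used to scale the histogram to counts
    eta:        learning rate (hypothetical choice, not from the paper)
    """
    # Count that the current synthetic histogram assigns to the query.
    synth_count = n * sum(w for x, w in weights.items() if query(x))
    # Boost elements selected by the query if we undercount, shrink them otherwise.
    direction = 1.0 if true_count > synth_count else -1.0
    updated = {x: w * math.exp(direction * eta * query(x))
               for x, w in weights.items()}
    # Renormalize so the weights remain a probability distribution.
    total = sum(updated.values())
    return {x: w / total for x, w in updated.items()}

# Toy example: uniform histogram over {0, 1, 2, 3}, query "x >= 2".
domain = [0, 1, 2, 3]
uniform = {x: 1.0 / len(domain) for x in domain}
is_large = lambda x: 1 if x >= 2 else 0
stepped = mwu_step(uniform, is_large, true_count=80, n=100)
mass_on_query = sum(w for x, w in stepped.items() if is_large(x))
```

In this toy run the synthetic histogram initially assigns a count of 50 to the query while the true count is 80, so the step shifts mass toward the elements with `x >= 2`. A private variant would have to decide which query to update on, and in which direction, using noisy information only.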

Abstract

We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a relative error guarantee for statistical queries under $(\varepsilon, \delta)$ differential privacy (DP). Namely, for a domain ${\cal X}$, a family ${\cal F}$ of queries $f : {\cal X} \to \{0, 1\}$, and $\zeta > 0$, our framework yields a mechanism that on input dataset $D \in {\cal X}^n$ outputs a synthetic dataset $\widehat{D} \in {\cal X}^n$ such that all statistical queries in ${\cal F}$ on $D$, namely $\sum_{x \in D} f(x)$ for $f \in {\cal F}$, are within a $1 \pm \zeta$ multiplicative factor of the corresponding value on $\widehat{D}$ up to an additive error that is polynomial in $\log |{\cal F}|$, $\log |{\cal X}|$, $\log n$, $\log(1/\delta)$, $1/\varepsilon$, and $1/\zeta$. In contrast, any $(\varepsilon, \delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{\cal F}|$, or $|{\cal X}|$. We complement our algorithm with nearly matching lower bounds.
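
To make the guarantee concrete, the following sketch checks one natural reading of it: the true count must lie between $(1-\zeta)$ and $(1+\zeta)$ times the synthetic count, with slack $\alpha$ on each side. The exact form of the inequality used in the paper may differ, and the helper names `count` and `within_relative_error`, as well as the toy datasets, are assumptions for illustration.

```python
def count(dataset, query):
    """Statistical (counting) query: number of records x with query(x) == 1."""
    return sum(query(x) for x in dataset)

def within_relative_error(true_count, synth_count, zeta, alpha):
    """Check (1 - zeta) * synth - alpha <= true <= (1 + zeta) * synth + alpha.

    This is one plausible reading of the relative-error guarantee,
    not necessarily the paper's exact formulation.
    """
    lower = (1 - zeta) * synth_count - alpha
    upper = (1 + zeta) * synth_count + alpha
    return lower <= true_count <= upper

# Toy datasets over the domain {0, 1}; the query counts the ones.
real_data = [1] * 1000 + [0] * 1000
synthetic_data = [1] * 950 + [0] * 1050
is_one = lambda x: 1 if x == 1 else 0

accurate = within_relative_error(
    count(real_data, is_one), count(synthetic_data, is_one), zeta=0.1, alpha=5
)
```

With a purely additive guarantee the allowed gap would be `alpha` regardless of the count; here the allowed gap grows with the count, which is why large counts can tolerate a larger absolute error while small counts are still pinned down to within roughly `alpha`.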

Paper Structure

This paper contains 27 sections, 20 theorems, 53 equations, 1 table, 5 algorithms.

Key Result

Proposition 2.2

For any $0< a\leq 1$, $\delta>0$ and integer $R > 0$, $\textsc{RangeMonitor}^{\mathrm{apx}}_{\bm h, a, \mathcal{Y}}$ (alg:outside-thresholds), after $R$ rounds of queries, satisfies $(\varepsilon, \delta)$-DP for $\varepsilon = O\left(a \log \frac{1}{\delta}\right)$. Let $\beta > 0$, $C := \frac{1}{

Theorems & Definitions (33)

  • Definition 2.1: $(\varepsilon, \delta)$-DP
  • Proposition 2.2: $\textsc{RangeMonitor}$ (Approx-DP) guarantees
  • Theorem 3.1
  • Lemma 3.2
  • proof
  • Claim 3.3
  • proof
  • Lemma 3.4
  • Theorem 3.5
  • Theorem 4.1
  • ...and 23 more