Debiasing Functions of Private Statistics in Postprocessing

Flavio Calmon, Elbert Du, Cynthia Dwork, Brian Finley, Grigory Franguridi

TL;DR

This work develops a deconvolution-based framework to obtain unbiased post-processing estimators for functions of private statistics under differential privacy, primarily using Laplace noise. It provides closed-form unbiased estimators for a broad class of twice-differentiable tempered functions (notably including 1/q) and extends to non-tempered cases via domain bounds and polynomial extensions, enabling unbiased estimation even when n is unknown. The authors apply these results to private mean estimation, slowly scaling per-record DP, and polynomials under general noise distributions, demonstrating practical advantages and proposing avenues for multivariate and non-Laplace extensions. Overall, the paper advances unbiased post-processing techniques for DP, with concrete mechanisms and theoretical guarantees that improve utility in privacy-preserving data analysis.
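The closed-form Laplace case can be illustrated with a short numerical check. This is a minimal sketch, not the authors' code: it assumes the standard deconvolution identity for zero-mean Laplace noise with scale $b$, namely that $g(x) = f(x) - b^2 f''(x)$ satisfies $E[g(q + \nu)] = f(q)$ for sufficiently regular $f$ (this summary does not spell out the paper's estimator, so treat the form as an assumption). The helper names `laplace_pdf`, `debias`, and `expectation` are hypothetical, and the expectation is approximated by a plain midpoint rule.

```python
import math

def laplace_pdf(x, b):
    """Density of a zero-mean Laplace distribution with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)

def debias(f, f_second, b):
    """Return g(x) = f(x) - b^2 * f''(x) (assumed Laplace deconvolution form)."""
    return lambda x: f(x) - b * b * f_second(x)

def expectation(g, q, b, half_width=60.0, steps=200_000):
    """Approximate E[g(q + nu)], nu ~ Laplace(0, b), with a midpoint rule.

    The grid spans [-half_width * b, half_width * b]; the truncated tails
    carry negligible mass for the slowly growing functions used here.
    """
    lo = -half_width * b
    h = 2.0 * half_width * b / steps
    total = 0.0
    for i in range(steps):
        nu = lo + (i + 0.5) * h
        total += g(q + nu) * laplace_pdf(nu, b)
    return total * h

# Illustrative values (assumptions, not taken from the paper).
q, b = 2.0, 1.5
cube = lambda x: x ** 3
cube_second = lambda x: 6.0 * x

naive_mean = expectation(cube, q, b)                               # biased plug-in
debiased_mean = expectation(debias(cube, cube_second, b), q, b)    # corrected
print(naive_mean, debiased_mean)
```

For the cube, the plug-in estimate has bias $6 b^2 q$ because $E[\nu^2] = 2b^2$, so the first number should sit near $q^3 + 6 b^2 q$ and the second near $q^3$. Functions with a singularity, such as $1/q$, need the domain-bounded treatment the summary mentions and are not covered by this sketch.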

Abstract

Given a differentially private unbiased estimate $\tilde{q}=q(D)+\nu$ of a statistic $q(D)$, we wish to obtain unbiased estimates of functions of $q(D)$, such as $1/q(D)$, solely through post-processing of $\tilde{q}$, with no further access to the confidential dataset $D$. To this end, we adapt the deconvolution method used for unbiased estimation in the statistical literature, deriving unbiased estimators for a broad family of twice-differentiable functions when the privacy-preserving noise $\nu$ is drawn from the Laplace distribution (Dwork et al., 2006). We further extend this technique to a more general class of functions, deriving approximately optimal estimators that are unbiased for values in a user-specified interval (possibly extending to $\pm \infty$). We use these results to derive an unbiased estimator for private means when the size $n$ of the dataset is not publicly known. In a numerical application, we find that a mechanism that uses our estimator to return an unbiased sample size and mean outperforms a mechanism that instead uses the previously known unbiased privacy mechanism for such means (Kamath et al., 2023). We also apply our estimators to develop unbiased transformation mechanisms for per-record differential privacy, a privacy concept in which the privacy guarantee is a public function of a record's value (Seeman et al., 2024). Our mechanisms provide stronger privacy guarantees than those in prior work (Finley et al., 2024) by using Laplace, rather than Gaussian, noise. Finally, using a different approach, we go beyond Laplace noise by deriving unbiased estimators for polynomials under the weak condition that the noise distribution has sufficiently many moments.
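The final claim, unbiased polynomial estimation under any noise with enough finite moments, can be sketched with a textbook moment recursion. This is an assumption about one workable construction, not necessarily the authors' approach: with $m_j = E[\nu^j]$ and $h_0 = 1$, define $h_k(x) = x^k - \sum_{j=1}^{k} \binom{k}{j} m_j h_{k-j}(x)$, which gives $E[h_k(q + \nu)] = q^k$ by the binomial expansion of $E[(q+\nu)^k]$. The function names below are hypothetical; polynomials are coefficient lists ordered from the constant term upward.

```python
from math import comb

def debiased_monomial(k, moments):
    """Coefficients (low to high degree) of h_k with E[h_k(q + nu)] = q^k.

    moments[j] must hold E[nu^j] for j = 0..k, with moments[0] == 1.
    """
    hs = []  # hs[d] holds the coefficient list of h_d
    for d in range(k + 1):
        coeffs = [0.0] * (d + 1)
        coeffs[d] = 1.0  # leading term x^d
        for j in range(1, d + 1):
            w = comb(d, j) * moments[j]
            # Subtract w * h_{d-j}(x), which removes the bias term of order j.
            for i, c in enumerate(hs[d - j]):
                coeffs[i] -= w * c
        hs.append(coeffs)
    return hs[k]

def debiased_polynomial(poly, moments):
    """Debias sum_k poly[k] * x^k term by term, using linearity of expectation."""
    out = [0.0] * len(poly)
    for k, a in enumerate(poly):
        for i, c in enumerate(debiased_monomial(k, moments)):
            out[i] += a * c
    return out

# Standard Gaussian noise: E[nu^j] for j = 0..4 is 1, 0, 1, 0, 3.
gauss_h4 = debiased_monomial(4, [1, 0, 1, 0, 3])

# Laplace noise with scale b = 1.5: E[nu^2] = 2 * b^2 = 4.5, odd moments vanish.
laplace_h3 = debiased_monomial(3, [1, 0, 4.5, 0])
print(gauss_h4, laplace_h3)
```

With standard Gaussian moments the recursion reproduces the probabilists' Hermite polynomial $x^4 - 6x^2 + 3$, and with Laplace moments it gives $x^3 - 6b^2 x$, which agrees with the $f - b^2 f''$ form for $f(x) = x^3$.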

Paper Structure

This paper contains 20 sections, 21 theorems, 117 equations, 3 figures, 3 algorithms.

Key Result

Theorem 8

(vanDijk+2013 section 7.1 property c)

Figures (3)

  • Figure 1: Standard deviations of the mechanisms $M_{SS}$ and $M_U$ for a mean of $n$ records in [0,1]. The mean is fixed at .5 and the mechanisms have a privacy budget of $\epsilon_2 = .5$.
  • Figure 2: Same as Figure 1, zoomed in to larger values of $n$ (see the horizontal axis endpoints). Standard deviations of the mechanisms $M_{SS}$ and $M_U$ for a mean of $n$ records in [0,1]. The mean is fixed at .5 and the mechanisms have a privacy budget of $\epsilon_2 = .5$.
  • Figure 3: Standard deviation of $M_{SS}$ divided by the standard deviation of $M_U$ for a mean of $n$ records in [0,1]. The mean is fixed at .5 and the mechanisms have a privacy budget of $\epsilon_2 = .5$.

Theorems & Definitions (52)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Theorem 8
  • Theorem 9
  • Theorem 10
  • ...and 42 more