Towards multi-purpose locally differentially-private synthetic data release via spline wavelet plug-in estimation

Thibault Randrianarisoa, Lukas Steinberger, Botond Szabó

TL;DR

This work tackles the challenge of locally differentially private synthetic data release that remains useful for a wide range of downstream inferences. It develops a spline-wavelet plug-in framework that first privately estimates the density and then computes arbitrary functionals $\Lambda(f)$ via plug-in, achieving minimax rates for both atomic and smooth functionals; adaptation is achieved through a Lepski-type procedure that does not require prior knowledge of the unknown smoothness. The authors provide private minimax lower bounds and show that the spline-wavelet estimators attain these rates, with an adaptive scheme ensuring optimal performance across smoothness classes. The approach enables multi-purpose private data release by storing and releasing sanitized spline-wavelet coefficients, allowing many analysts to perform diverse analyses under a single, principled privacy budget with provable guarantees and efficiency gains.

Abstract

We develop plug-in estimators for locally differentially private semi-parametric estimation via spline wavelets. The approach leads to optimal rates of convergence for a large class of estimation problems that are characterized by (differentiable) functionals $\Lambda(f)$ of the true data-generating density $f$. The crucial feature of the locally private data $Z_1,\dots, Z_n$ we generate is that they do not depend on the particular functional $\Lambda$ (or the unknown density $f$) the analyst wants to estimate. Hence, the synthetic data can be generated and stored a priori and can subsequently be used by any number of analysts to estimate many vastly different functionals of interest at the provably optimal rate. In principle, this removes a long-standing practical limitation in the statistics of differential privacy, namely, that optimal privacy mechanisms need to be tailored to the specific estimation problem at hand.
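To make the "generate once, reuse for many functionals" idea concrete, here is a minimal toy sketch. It is not the paper's mechanism: it swaps the spline wavelets for piecewise-constant Haar scaling functions on $[0,1]$ and uses the standard Laplace mechanism, and every name in it (`privatize`, `estimate_coefficients`, `plug_in_mean`, `plug_in_cdf`, `run_demo`) is hypothetical. Each user releases noisy basis coefficients that do not depend on any functional; an analyst then averages them and plugs the resulting density estimate into two different functionals.

```python
import math
import random


def privatize(x, level, alpha, rng):
    """One user's alpha-LDP report: noisy Haar scaling coefficients of x in [0, 1]."""
    n_bins = 2 ** level
    height = math.sqrt(n_bins)              # phi_{J,k} equals 2^{J/2} on its bin
    k = min(int(x * n_bins), n_bins - 1)    # index of the bin containing x
    # Two users' feature vectors differ in at most two entries of size `height`,
    # so the L1 sensitivity is 2 * height (standard Laplace mechanism).
    scale = 2.0 * height / alpha
    report = []
    for j in range(n_bins):
        # The difference of two Exp(1) draws is a standard Laplace variable.
        noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        report.append((height if j == k else 0.0) + noise)
    return report


def estimate_coefficients(reports):
    """Average the private reports to estimate the basis coefficients of f."""
    n = len(reports)
    sums = [0.0] * len(reports[0])
    for report in reports:
        for j, value in enumerate(report):
            sums[j] += value
    return [s / n for s in sums]


def plug_in_mean(coeffs, level):
    """Plug-in estimate of Lambda(f) = integral of x * f(x) over [0, 1]."""
    n_bins = 2 ** level
    # integral of x * phi_{J,k}(x) dx = midpoint_k / sqrt(n_bins)
    return sum(c * ((k + 0.5) / n_bins) / math.sqrt(n_bins)
               for k, c in enumerate(coeffs))


def plug_in_cdf(coeffs, level, t):
    """Plug-in estimate of Lambda(f) = integral of f over [0, t]."""
    n_bins = 2 ** level
    width = 1.0 / n_bins
    total = 0.0
    for k, c in enumerate(coeffs):
        overlap = min(max(t - k * width, 0.0), width)  # length of bin k inside [0, t]
        total += c * math.sqrt(n_bins) * overlap
    return total


def run_demo(n=50_000, level=3, alpha=1.0, seed=0):
    """Simulate Beta(2, 5) users and reuse the same reports for two functionals."""
    rng = random.Random(seed)
    reports = [privatize(rng.betavariate(2.0, 5.0), level, alpha, rng)
               for _ in range(n)]
    coeffs = estimate_coefficients(reports)
    return plug_in_mean(coeffs, level), plug_in_cdf(coeffs, level, 0.5)
```

Calling `run_demo()` returns estimates of the mean and of $P(X \le 0.5)$ computed from the same stored reports. The fixed resolution `level` is an arbitrary choice for illustration; the sketch makes no attempt at the adaptive, rate-optimal tuning described in the paper.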

Paper Structure

This paper contains 32 sections, 17 theorems, 243 equations.

Key Result

Theorem 3.1

Fix $n,p\in{\mathbb N}$, $\alpha>0$ and a convex set $\mathcal{W}\subseteq \mathcal{W}_p$. Let $\Lambda:\mathcal{W} \to \mathbb R$ be differentiable as in eq: form functional. If $\Lambda$ is not constant on $\mathcal{W}$, then there exist constants $c,C>0$ depending only on $\Lambda$ and [...] provided that $n(e^\alpha-1)^2\ge C$. Here, the second infimum is over all estimators [...]

Theorems & Definitions (28)

  • Theorem 3.1
  • Definition 1
  • Theorem 3.2
  • Theorem 4.1
  • proof
  • Definition 2
  • Theorem 4.2
  • Theorem 4.3
  • Proposition 5.1
  • proof
  • ...and 18 more