On Privately Estimating a Single Parameter

Hilal Asi, John C. Duchi, Kunal Talwar

TL;DR

This work develops principled, differentially private methods for estimating a single parameter within a larger parametric model, leveraging local stability notions and private certificates to achieve instance-optimal accuracy. It introduces two release strategies: privately certifying a lower bound on the Hessian's minimum eigenvalue and then releasing the parameter with noise scaled to a local modulus of continuity, and privately releasing linear functionals via test-and-release on stability ratios. The framework covers general smooth M-estimation and quasi-self-concordant GLMs, providing rigorous stability and eigenvalue bounds, recursive private bounds for eigenvalues, and algorithms for releasing both full parameter vectors and individual functionals. Experiments on synthetic robust regression and Folktables (US Census) data illustrate practical performance and the transition where private releases approach non-private accuracy, while the discussion highlights dimension-dependent challenges and future directions. Overall, the paper contributes a cohesive private-estimation toolkit that adapts to local problem geometry to achieve near-optimal private estimates of targeted parameters in high-dimensional settings.

Abstract

We investigate differentially private estimators for individual parameters within larger parametric models. While generic private estimators exist, the estimators we provide repose on new local notions of estimand stability, and these notions allow procedures that provide private certificates of their own stability. By leveraging these private certificates, we provide computationally and statistically efficient mechanisms that release private statistics that are, at least asymptotically in the sample size, essentially unimprovable: they achieve instance-optimal bounds. Additionally, we investigate the practicality of the algorithms both in simulated data and in real-world data from the American Community Survey and US Census, highlighting scenarios in which the new procedures are successful and identifying areas for future work.
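The certify-then-release idea summarized above can be illustrated with a minimal Python sketch. This is not the paper's algorithm and carries no privacy proof: it assumes a hypothetical per-sample gradient bound `grad_bound`, a hypothetical per-sample Hessian sensitivity `hess_sens`, an even split of the privacy budget, and the textbook output-perturbation sensitivity 2G/(nλ) for a strongly convex objective. All function and parameter names are invented for illustration.

```python
import math
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_eig_lower_bound(lam_min, n, hess_sens, eps, delta, rng):
    """Noisy lower bound on the minimum Hessian eigenvalue.

    Assumption: changing one sample moves lam_min by at most hess_sens / n.
    The shift b * log(1 / (2 * delta)) makes the noisy value fall below
    lam_min with probability at least 1 - delta.
    """
    b = hess_sens / (n * eps)
    return lam_min + laplace(b, rng) - b * math.log(1.0 / (2.0 * delta))


def release_coordinate(theta_j, lam_min, n, grad_bound, hess_sens,
                       eps, delta, rng, lam_floor=1e-3):
    """Certify curvature first, then release one coordinate with scaled noise.

    Returns None (refuses to answer) when the certified curvature is too small.
    """
    # Spend half of the budget on the certificate (an illustrative choice).
    lam_lb = private_eig_lower_bound(lam_min, n, hess_sens, eps / 2, delta, rng)
    if lam_lb <= lam_floor:
        return None
    # Heuristic sensitivity 2 * G / (n * lambda), using the certified bound.
    scale = 2.0 * grad_bound / (n * lam_lb * (eps / 2))
    return theta_j + laplace(scale, rng)


rng = random.Random(0)
released = release_coordinate(theta_j=0.3, lam_min=1.0, n=10**5, grad_bound=1.0,
                              hess_sens=1.0, eps=1.0, delta=1e-6, rng=rng)
refused = release_coordinate(theta_j=0.3, lam_min=0.0, n=10**5, grad_bound=1.0,
                             hess_sens=1.0, eps=1.0, delta=1e-6, rng=rng)
```

With well-conditioned curvature the noise scale shrinks like 1/(nλ), so the released value stays close to the non-private estimate; when the certificate fails, the sketch declines to release rather than adding very large noise.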

Paper Structure

This paper contains 60 sections, 38 theorems, 249 equations, 7 figures.

Key Result

Lemma 2.1

The following properties hold.

Figures (7)

  • Figure 1:
  • Figure 2: Error $|\theta_1^\star - \widehat{\theta}_1(P_n)|$ in the first coordinate of the target $\theta(P_n)$ versus sample size for varying dimensions $d = 5, 10, 20$ in a robust regression experiment, where $\left\|{\theta^\star}\right\|_2 = 1$. The method Local (orange triangle) is Algorithm \ref{alg:release-u-t-theta}; SGD is DPSGD, non-private is the non-private idealized version of the methods here (item \ref{item:non-private-local}), objective is objective perturbation (\ref{eqn:obj-pert}), and naive is the naive output perturbation estimator. Objective perturbation and the methods here exhibit the best performance, with Alg. \ref{alg:release-u-t-theta} exhibiting a noticeable improvement at sufficiently large sample size.
  • Figure 3: Identical to Fig. \ref{fig:sample-size-scaling}, except that $\left\|{\theta^\star}\right\|_2 = 5$. Note that the gap in performance between objective perturbation and Alg. \ref{alg:release-u-t-theta} is larger than in the case that $\left\|{\theta^\star}\right\|_2 = 1$.
  • Figure 4: Error $|\widehat{\theta}_j - \theta_j^\star|$ as a function of the privacy parameter $\varepsilon$ for simulated robust regression estimating a single (random) coordinate $j$. (a) Small norm $\left\|{\theta^\star}\right\|_2 = 1$ (b) Larger norm $\left\|{\theta^\star}\right\|_2 = 6$. Dimension $d = 10$ in both and sample size $n = 10^5$.
  • Figure 5: Error $\|{\theta - \theta^\star}\|_2$ as a function of the privacy parameter $\varepsilon$ for simulated robust regression estimating the entire parameter $\theta^\star \in \mathbb{R}^d$, where $d = 10$ and sample size $n = 10^5$. (a) Small norm $\left\|{\theta^\star}\right\|_2 = 1$ (b) Larger norm $\left\|{\theta^\star}\right\|_2 = 6$.
  • ...and 2 more figures

Theorems & Definitions (39)

  • Definition 1.1
  • Lemma 2.1: Self-concordance properties
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.5
  • Corollary 3.1
  • Corollary 3.2
  • Theorem 1
  • Corollary 3.3
  • Corollary 3.4
  • ...and 29 more