Efficiency in local differential privacy

Lukas Steinberger

TL;DR

We present an algorithm for finding a (nearly) optimal privacy mechanism $\hat{Q}$, together with an estimator based on the corresponding sanitized data that achieves the asymptotically optimal variance under local differential privacy.

Abstract

We develop a theory of asymptotic efficiency in regular parametric models when data confidentiality is ensured by local differential privacy (LDP). Even though efficient parameter estimation is a classical and well-studied problem in mathematical statistics, it leads to several non-trivial obstacles that need to be tackled when dealing with the LDP case. Starting from a standard parametric model $\mathcal P=(P_θ)_{θ\inΘ}$, $Θ\subseteq\mathbb R^p$, for the iid unobserved sensitive data $X_1,\dots, X_n$, we establish local asymptotic mixed normality (along subsequences) of the model $$Q^{(n)}\mathcal P=(Q^{(n)}P_θ^n)_{θ\inΘ}$$ generating the sanitized observations $Z_1,\dots, Z_n$, where $Q^{(n)}$ is an arbitrary sequence of sequentially interactive privacy mechanisms. This result readily implies convolution and local asymptotic minimax theorems. In case $p=1$, the optimal asymptotic variance is found to be the inverse of the supremal Fisher-Information $\sup_{Q\in\mathcal Q_α} I_θ(Q\mathcal P)\in\mathbb R$, where the supremum runs over all $α$-differentially private (marginal) Markov kernels. We present an algorithm for finding a (nearly) optimal privacy mechanism $\hat{Q}$ and an estimator $\hatθ_n(Z_1,\dots, Z_n)$ based on the corresponding sanitized data that achieves this asymptotically optimal variance.
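To make the quantity $I_θ(Q\mathcal P)$ concrete, the following minimal sketch evaluates it in closed form for one simple mechanism. It assumes the Gaussian location model $X\sim N(θ,1)$ and a hypothetical mechanism that reports the sign of $X$ through binary randomized response; this is an illustration only, not the (nearly) optimal $\hat{Q}$ of the paper, so its value is merely a lower bound on the supremum. The function name `rr_sign_fisher_info` is introduced here for illustration.

```python
import math

def rr_sign_fisher_info(theta: float, alpha: float) -> float:
    """Fisher information about theta carried by one sanitized bit Z.

    Assumed setup (illustrative, not from the paper): X ~ N(theta, 1),
    Z = sign(X) with probability e^alpha / (1 + e^alpha), flipped otherwise.
    """
    keep = math.exp(alpha) / (1.0 + math.exp(alpha))       # prob. of reporting the true sign
    cdf = 0.5 * (1.0 + math.erf(theta / math.sqrt(2.0)))   # P_theta(X > 0) = Phi(theta)
    pdf = math.exp(-0.5 * theta ** 2) / math.sqrt(2.0 * math.pi)
    q = keep * cdf + (1.0 - keep) * (1.0 - cdf)            # P(Z = +1)
    dq = (2.0 * keep - 1.0) * pdf                          # derivative of q with respect to theta
    return dq ** 2 / (q * (1.0 - q))                       # Fisher information of a Bernoulli(q(theta))

for alpha in (0.5, 1.0, 2.0):
    print(alpha, rr_sign_fisher_info(0.0, alpha))
```

At $θ=0$ this expression reduces to $(2/\pi)\tanh^2(α/2)$, which stays below the non-private Fisher information of $1$ for every finite $α$.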

Paper Structure

This paper contains 38 sections, 28 theorems, 175 equations, 5 figures.

Key Result

Lemma 3.1

Fix $\alpha\in(0,\infty)$. Suppose the model $\mathcal{P}=(P_\theta)_{\theta\in\Theta}$ is DQM at $\theta\in\Theta\subseteq\mathbb{R}^p$, with score function $s_\theta$, and $Q\in\mathcal{Q}_\alpha(\mathcal{X}\to\mathcal{Z})$ is a privacy mechanism. Then the model $Q\mathcal{P}=(QP_\theta)_{\theta\in\Theta}$ [...] that is, a regular conditional expectation of $s_\theta(X)$ given $Z=z$, where the joint distribution [...]
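The part of the lemma that survives here ties the sanitized model to a regular conditional expectation of $s_\theta(X)$ given $Z=z$. On a finite sample space this can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: the toy model $p_\theta(x)\propto e^{\theta x}$ on $\{0,\dots,k-1\}$, the $k$-ary randomized response mechanism, and all function names are hypothetical choices made for this example.

```python
import math

def model_probs(theta, k):
    """Toy model (assumed): p_theta(x) proportional to exp(theta * x), x = 0..k-1."""
    w = [math.exp(theta * x) for x in range(k)]
    total = sum(w)
    return [v / total for v in w]

def k_ary_rr(k, alpha):
    """k-ary randomized response: Q[z][x] = P(Z = z | X = x), alpha-LDP by construction."""
    e = math.exp(alpha)
    return [[(e if z == x else 1.0) / (e + k - 1) for x in range(k)] for z in range(k)]

def sanitized_probs(Q, p):
    """Marginal law of Z: (Qp)(z) = sum_x Q[z][x] p(x)."""
    return [sum(qzx * px for qzx, px in zip(row, p)) for row in Q]

def private_score(Q, theta, k):
    """E[s_theta(X) | Z = z], with s_theta(x) = x - E_theta[X] for this toy model."""
    p = model_probs(theta, k)
    mean = sum(x * px for x, px in enumerate(p))
    qp = sanitized_probs(Q, p)
    return [sum(row[x] * p[x] * (x - mean) for x in range(k)) / qp[z]
            for z, row in enumerate(Q)]

k, alpha, theta, h = 5, 1.0, 0.3, 1e-6
Q = k_ary_rr(k, alpha)
t = private_score(Q, theta, k)

# Central finite difference of log (Q p_theta)(z), for comparison with t.
up = sanitized_probs(Q, model_probs(theta + h, k))
down = sanitized_probs(Q, model_probs(theta - h, k))
fd = [(math.log(a) - math.log(b)) / (2 * h) for a, b in zip(up, down)]

# Fisher information of the sanitized and of the original model.
p = model_probs(theta, k)
qp = sanitized_probs(Q, p)
mean = sum(x * px for x, px in enumerate(p))
info_private = sum(w * s * s for w, s in zip(qp, t))
info_full = sum(px * (x - mean) ** 2 for x, px in enumerate(p))
print(max(abs(a - b) for a, b in zip(t, fd)), info_private, info_full)
```

The first printed number is the largest gap between the conditional-expectation score and the finite-difference score; the other two compare the Fisher information after and before sanitization.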

Figures (5)

  • Figure 1: Graphical representation of the efficient two-step $\alpha$-sequentially interactive privacy mechanism and estimation procedure.
  • Figure 2: Optimal values of \eqref{eq:maxGauss} for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$. For $k\to\infty$, theory predicts that these values converge to the global optimum $\sup_{Q\in\mathcal{Q}_\alpha(\mathcal{X})} I_0(Q{\mathcal{P}})$.
  • Figure 3: Number of non-zero rows (vertical axis), i.e., size of the output alphabet $\hat{\mathcal{Z}} \subseteq [k]$, of the optimal privacy mechanism $\hat{Q}\in\mathcal{Q}_\alpha([k]\to\hat{\mathcal{Z}})$ that maximizes \eqref{eq:maxGauss}, for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$.
  • Figure 4: Optimal values of the private Fisher information $Q\mapsto I_1(QT_{n_1,1}{\mathcal{P}}_{scale})$ in the Gaussian scale model for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$. For $k\to\infty$, theory predicts that these values converge to the global optimum $\sup_{Q\in\mathcal{Q}_\alpha(\mathcal{X})} I_1(Q{\mathcal{P}}_{scale})$.
  • Figure 5: Number of non-zero rows, i.e., size of the output alphabet $\hat{\mathcal{Z}} \subseteq [k]$, of the optimal privacy mechanism $\hat{Q}\in\mathcal{Q}_\alpha([k]\to\hat{\mathcal{Z}})$ in the Gaussian scale model, for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$.
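The captions above describe mechanisms on $[k]$ as matrices whose non-zero rows form the output alphabet. As a minimal sketch of that representation, assuming the convention that entry `Q[z][x]` is the probability of emitting output $z$ on input $x$, the helpers below check the $\alpha$-LDP constraint and count the outputs actually used. The function names and the example matrix are hypothetical and are not taken from the paper.

```python
import math

def is_alpha_ldp(Q, alpha, tol=1e-12):
    """Check Q[z][x] <= exp(alpha) * Q[z][x'] for every output z and all inputs x, x'."""
    bound = math.exp(alpha)
    return all(max(row) <= bound * min(row) + tol for row in Q)

def output_alphabet_size(Q, tol=1e-12):
    """Number of rows of Q carrying positive mass, i.e. outputs that are actually emitted."""
    return sum(1 for row in Q if max(row) > tol)

alpha = 1.0
keep = math.exp(alpha) / (1.0 + math.exp(alpha))
# Hypothetical 4 x 4 mechanism on [k] with k = 4 that only ever emits two symbols.
Q = [
    [keep, keep, 1 - keep, 1 - keep],
    [1 - keep, 1 - keep, keep, keep],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(is_alpha_ldp(Q, alpha), output_alphabet_size(Q))
```

The same matrix fails the check for a smaller privacy parameter such as $\alpha=0.5$, since the ratio of entries within its first two rows equals $e^{1}$.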

Theorems & Definitions (53)

  • Remark 2.1: On the $L^1$-information of Duchi23
  • Definition 1: Differentiability in Quadratic Mean
  • Lemma 3.1
  • proof
  • Definition 2: Local Asymptotic Mixed Normality
  • Lemma 3.2
  • proof
  • Theorem 3.3
  • Remark 3.4
  • Theorem 3.5
  • ...and 43 more