Efficiency in local differential privacy

Lukas Steinberger

TL;DR

The paper presents an algorithm for finding a (nearly) optimal privacy mechanism $\hat{Q}$, together with an estimator based on the corresponding sanitized data that achieves the asymptotically optimal variance under local differential privacy.

Abstract

We develop a theory of asymptotic efficiency in regular parametric models when data confidentiality is ensured by local differential privacy (LDP). Even though efficient parameter estimation is a classical and well-studied problem in mathematical statistics, it leads to several non-trivial obstacles that need to be tackled when dealing with the LDP case. Starting from a standard parametric model $\mathcal P=(P_θ)_{θ\inΘ}$, $Θ\subseteq\mathbb R^p$, for the iid unobserved sensitive data $X_1,\dots, X_n$, we establish local asymptotic mixed normality (along subsequences) of the model $$Q^{(n)}\mathcal P=(Q^{(n)}P_θ^n)_{θ\inΘ}$$ generating the sanitized observations $Z_1,\dots, Z_n$, where $Q^{(n)}$ is an arbitrary sequence of sequentially interactive privacy mechanisms. This result readily implies convolution and local asymptotic minimax theorems. In the case $p=1$, the optimal asymptotic variance is found to be the inverse of the supremal Fisher information $\sup_{Q\in\mathcal Q_α} I_θ(Q\mathcal P)\in\mathbb R$, where the supremum runs over all $α$-differentially private (marginal) Markov kernels. We present an algorithm for finding a (nearly) optimal privacy mechanism $\hat{Q}$ and an estimator $\hatθ_n(Z_1,\dots, Z_n)$ based on the corresponding sanitized data that achieves this asymptotically optimal variance.
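To make the quantity $I_θ(Q\mathcal P)$ from the abstract concrete, the following is a minimal sketch for finite alphabets, assuming Bernoulli($θ$) sensitive data and using binary randomized response as a familiar $α$-private channel. The function names and the choice of channel are illustrative assumptions; this is not the paper's algorithm for finding $\hat{Q}$.

```python
import math


def randomized_response(alpha):
    """Illustrative 2x2 channel Q[z][x] = P(Z=z | X=x).

    The input bit is kept with probability e^alpha / (1 + e^alpha).
    """
    keep = math.exp(alpha) / (1.0 + math.exp(alpha))
    return [[keep, 1.0 - keep],
            [1.0 - keep, keep]]


def is_alpha_ldp(Q, alpha, tol=1e-9):
    """Check Q[z][x] <= e^alpha * Q[z][x'] for every output z and inputs x, x'."""
    bound = math.exp(alpha)
    return all(qx <= bound * qy + tol
               for row in Q for qx in row for qy in row)


def private_fisher_information(Q, p, dp):
    """Fisher information of Z ~ Q P_theta on a finite alphabet.

    p  : pmf of X at theta
    dp : derivative of that pmf with respect to theta
    """
    info = 0.0
    for row in Q:
        q = sum(r * px for r, px in zip(row, p))      # P(Z = z)
        dq = sum(r * dpx for r, dpx in zip(row, dp))  # d/dtheta P(Z = z)
        if q > 0.0:
            info += dq * dq / q
    return info


def bernoulli_model(theta):
    """Return (pmf, derivative of pmf); index 0 is x=0, index 1 is x=1."""
    return [1.0 - theta, theta], [-1.0, 1.0]


if __name__ == "__main__":
    alpha, theta = 1.0, 0.5
    p, dp = bernoulli_model(theta)
    identity = [[1.0, 0.0], [0.0, 1.0]]  # no privatization
    print("non-private:", private_fisher_information(identity, p, dp))
    print("private    :", private_fisher_information(randomized_response(alpha), p, dp))
```

In this sketch the privatized information is strictly smaller than the non-private one, which is the loss that the supremum over $\mathcal Q_α$ in the abstract seeks to minimize.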

Paper Structure (38 sections, 28 theorems, 175 equations, 5 figures)

This paper contains 38 sections, 28 theorems, 175 equations, 5 figures.

Introduction
Related literature
Preliminaries and notation
Sequentially interactive and non-interactive differential privacy
Two simple illustrative examples
Bernoulli data
Binomial $(2,\theta)$ data
Asymptotic lower bounds for estimation in locally private models
Regularity of locally private models
LAMN along subsequences
A private convolution theorem
Efficiency of the private two-step MLE
Consistent non-interactive $\alpha$-private estimation
Maximum Likelihood
Method of moments
...and 23 more sections

Key Result

Lemma 3.1

Fix $\alpha\in(0,\infty)$. Suppose the model ${\mathcal{P}}=(P_\theta)_{\theta\in\Theta}$ is DQM at $\theta\in\Theta\subseteq{\mathbb R}^p$, with score function $s_\theta$, and $Q\in\mathcal{Q}_\alpha(\mathcal{X}\to\mathcal{Z})$ is a privacy mechanism. Then the model $Q{\mathcal{P}}=(QP_\theta)_{\theta\in\Theta}$ [...] that is, a regular conditional expectation of $s_\theta(X)$ given $Z=z$, where the joint distribution [...]
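The surviving part of the lemma describes a score for the privatized model as a conditional expectation of $s_\theta(X)$ given $Z=z$. On finite alphabets this relationship can be checked numerically. The sketch below assumes Bernoulli($\theta$) data and a hypothetical three-output channel `Q_EXAMPLE`, chosen only for illustration, and compares the conditional expectation with a finite-difference derivative of $\log q_\theta(z)$, where $q_\theta$ denotes the pmf of $Z$.

```python
import math

# Hypothetical channel Q[z][x] = P(Z=z | X=x); each column sums to one and
# all within-row ratios are at most 2.5 <= e^1, so it is 1-differentially private.
Q_EXAMPLE = [[0.5, 0.2],
             [0.3, 0.3],
             [0.2, 0.5]]


def bernoulli_pmf(theta):
    """pmf of X; index 0 is x=0, index 1 is x=1."""
    return [1.0 - theta, theta]


def bernoulli_score(theta):
    """s_theta(x) = d/dtheta log p_theta(x) for x = 0 and x = 1."""
    return [-1.0 / (1.0 - theta), 1.0 / theta]


def output_pmf(Q, theta):
    """pmf of the sanitized observation Z."""
    p = bernoulli_pmf(theta)
    return [sum(r * px for r, px in zip(row, p)) for row in Q]


def conditional_score(Q, theta):
    """E[s_theta(X) | Z = z] for each z, computed via Bayes' rule."""
    p = bernoulli_pmf(theta)
    s = bernoulli_score(theta)
    q = output_pmf(Q, theta)
    return [sum(r * px * sx for r, px, sx in zip(row, p, s)) / qz
            for row, qz in zip(Q, q)]


def numeric_score(Q, theta, h=1e-6):
    """Central finite difference of log q_theta(z) with respect to theta."""
    up = output_pmf(Q, theta + h)
    down = output_pmf(Q, theta - h)
    return [(math.log(u) - math.log(d)) / (2.0 * h) for u, d in zip(up, down)]


if __name__ == "__main__":
    theta = 0.3
    for z, (c, n) in enumerate(zip(conditional_score(Q_EXAMPLE, theta),
                                   numeric_score(Q_EXAMPLE, theta))):
        print(f"z={z}: conditional expectation={c:.6f}, numeric derivative={n:.6f}")
```

The two columns agree up to finite-difference error. The middle output, whose row of `Q_EXAMPLE` does not depend on $x$, carries no information about $\theta$ and has score zero.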

Figures (5)

Figure 1: Graphical representation of the efficient two-step $\alpha$-sequentially interactive privacy mechanism and estimation procedure.
Figure 2: Optimal values of \ref{eq:maxGauss} for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$. For $k\to\infty$ theory predicts that these values converge to the global optimum $\sup_{Q\in\mathcal{Q}_\alpha(\mathcal{X})} I_0(Q{\mathcal{P}})$.
Figure 3: Number of non-zero rows (vertical axis), i.e., size of the output alphabet $\hat{\mathcal{Z}} \subseteq [k]$, of the optimal privacy mechanism $\hat{Q}\in\mathcal{Q}_\alpha([k]\to\hat{\mathcal{Z}})$ that maximizes \ref{eq:maxGauss}, for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$.
Figure 4: Optimal values of private Fisher-Information $Q\mapsto I_1(QT_{n_1,1}{\mathcal{P}}_{scale})$ in the Gaussian scale model for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$. Theory predicts that these values converge to the global optimum $\sup_{Q\in\mathcal{Q}_\alpha(\mathcal{X})} I_1(Q{\mathcal{P}}_{scale})$, for $k\to\infty$.
Figure 5: Number of non-zero rows, i.e., size of the output alphabet $\hat{\mathcal{Z}} \subseteq [k]$, of the optimal privacy mechanism $\hat{Q}\in\mathcal{Q}_\alpha([k]\to\hat{\mathcal{Z}})$ in the Gaussian scale model, for different resolution levels $k$ (horizontal axis) and privacy parameters $\alpha$.

Theorems & Definitions (53)

Remark 2.1: On the $L^1$-information of Duchi23
Definition 1: Differentiability in Quadratic Mean
Lemma 3.1
proof
Definition 2: Local Asymptotic Mixed Normality
Lemma 3.2
proof
Theorem 3.3
Remark 3.4
Theorem 3.5
...and 43 more
