Efficient Estimation of a Gaussian Mean with Local Differential Privacy
Nikita P. Kalinin, Lukas Steinberger
TL;DR
This work solves the Gaussian mean estimation problem under local differential privacy by identifying the sign mechanism, which applies randomized response to $\mathrm{sgn}(X-\theta)$, as optimally informative in the high-privacy regime ($\epsilon\le 1.04$). To overcome the dependence on the unknown $\theta$, the authors develop a two-stage estimator that first privately estimates $\theta$ and then applies the sign-based mechanism with the estimated parameter, achieving asymptotically efficient variance. They derive a complete proof via a discrete approximation, a linear-programming reformulation, and a duality argument, establishing that the sign mechanism attains the maximal Fisher information and thus the minimal asymptotic variance among all $\epsilon$-private mechanisms in this regime. The results extend to known-variance settings with a simple rescaling and demonstrate practical gains through simulations, while discussing limitations for larger privacy budgets and higher-dimensional problems. The work provides a precise, closed-form solution to an important LDP estimation problem and offers a concrete, tunable procedure for efficient private mean estimation.
Abstract
In this paper we study the problem of estimating the unknown mean $θ$ of a unit-variance Gaussian distribution under local differential privacy (LDP). In the high-privacy regime ($ε\le 1$), we identify an optimal privacy mechanism that asymptotically minimizes the variance of the estimator. Our main technical contribution is the maximization of the Fisher information of the sanitized data with respect to the local privacy mechanism $Q$. We find that the exact solution $Q_{θ,ε}$ of this maximization is the sign mechanism, which applies randomized response to the sign of $X_i-θ$, where $X_1,\dots, X_n$ are the confidential i.i.d. original samples. However, since this optimal local mechanism depends on the unknown mean $θ$, we employ a two-stage LDP parameter estimation procedure, which requires splitting the agents into two groups. The first $n_1$ observations are used to estimate the parameter $θ$ consistently, but not necessarily efficiently, by $\tilde{θ}_{n_1}$. This estimate is then updated by applying the sign mechanism with $\tilde{θ}_{n_1}$ in place of $θ$ to the remaining $n-n_1$ observations, yielding an LDP and efficient estimator of the unknown mean.
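
The two-stage procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the pilot stage uses the sign mechanism with a fixed threshold (`pilot_threshold`, here 0), and both stages recover the mean by inverting the relation $E[Z] = \frac{e^{ε}-1}{e^{ε}+1}\,(2Φ(θ - t) - 1)$ for a privatized sign $Z$ with threshold $t$, which is a simple plug-in estimator rather than the update rule analyzed in the paper. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the inverse CDF


def sign_mechanism(x, threshold, eps, rng):
    """Randomized response on sgn(x - threshold); returns +1 or -1.

    The true sign is kept with probability e^eps / (1 + e^eps) and
    flipped otherwise, so the released bit is eps-LDP.
    """
    true_sign = 1 if x >= threshold else -1
    keep_prob = math.exp(eps) / (1.0 + math.exp(eps))
    return true_sign if rng.random() < keep_prob else -true_sign


def invert_signs(z, threshold, eps):
    """Plug-in estimate of theta from privatized signs (assumed estimator).

    Uses E[Z] = shrink * (2 * Phi(theta - threshold) - 1) for unit variance.
    """
    n = len(z)
    shrink = (math.exp(eps) - 1.0) / (math.exp(eps) + 1.0)
    p = 0.5 * (1.0 + (sum(z) / n) / shrink)  # estimate of Phi(theta - threshold)
    margin = 1.0 / (2 * n)                   # keep p strictly inside (0, 1)
    p = min(max(p, margin), 1.0 - margin)
    return threshold + _STD_NORMAL.inv_cdf(p)


def two_stage_estimate(xs, eps, n1, rng, pilot_threshold=0.0):
    """Stage 1: pilot estimate from the first n1 users.
    Stage 2: sign mechanism centered at the pilot, applied to the rest."""
    z1 = [sign_mechanism(x, pilot_threshold, eps, rng) for x in xs[:n1]]
    theta_tilde = invert_signs(z1, pilot_threshold, eps)
    z2 = [sign_mechanism(x, theta_tilde, eps, rng) for x in xs[n1:]]
    return invert_signs(z2, theta_tilde, eps)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.7
    xs = [rng.gauss(theta, 1.0) for _ in range(40000)]
    print(two_stage_estimate(xs, eps=1.0, n1=4000, rng=rng))
```

Each user releases a single randomized bit, and the second-stage threshold depends only on first-stage outputs that were already privatized, so the sketch stays within the local model. The split `n1` and the pilot threshold are tuning choices in this sketch, not values taken from the paper.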
