Better and Simpler Lower Bounds for Differentially Private Statistical Estimation
Shyam Narayanan
TL;DR
The paper derives optimal lower bounds, up to logarithmic factors, for high-dimensional mean and covariance estimation under approximate differential privacy. Using fingerprinting-based arguments and a Bayesian (Inverse Wishart) prior over the covariance, it shows that private estimators need at least $n \ge \tilde{\Omega}\left(\frac{d}{\alpha^2} + \frac{d^{3/2}}{\alpha \varepsilon}\right)$ samples to estimate a Gaussian covariance up to spectral error $\alpha$, and $n \ge \tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples for heavy-tailed mean estimation with bounded $k$th moments. The fingerprinting approach keeps the proofs comparatively simple, and the results extend and tighten prior work, including improvements for empirical covariance estimation. A key implication is a dimension-based separation between robustness and privacy: robust spectral covariance estimation can be statistically easier than private spectral covariance estimation in high dimensions. Overall, the findings provide near-optimal lower bounds with simple proofs that align with existing upper bounds and sharpen our understanding of privacy-robustness trade-offs in high-dimensional statistical estimation.
Abstract
We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples, which is tight up to logarithmic factors. This result improves over previous work which established this for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and is also simpler than previous work. Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. Previous work for this problem was only able to establish this lower bound against pure differential privacy, or in the special case of $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation utilizes a Bayesian approach to show that, under an Inverse Wishart prior distribution for the covariance matrix, no private estimator can be accurate even in expectation, without sufficiently many samples.
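To give a feel for how the two bounds scale, the following is a minimal Python sketch that evaluates the expressions inside the $\tilde{\Omega}(\cdot)$ with all constants and logarithmic factors dropped. The function names and the restriction $k > 1$ are illustrative assumptions, not part of the paper, and the returned values indicate growth rates rather than actual sample counts.

```python
import math


def covariance_lower_bound_scaling(d: int, alpha: float, eps: float) -> float:
    """Evaluate d^{3/2}/(alpha*eps) + d/alpha^2.

    Hypothetical helper: constants and log factors hidden by the
    tilde-Omega notation are dropped, so only the scaling is meaningful.
    """
    privacy_term = d ** 1.5 / (alpha * eps)   # cost attributable to privacy
    statistical_term = d / alpha ** 2          # cost incurred even without privacy
    return privacy_term + statistical_term


def heavy_tailed_mean_lower_bound_scaling(
    d: int, alpha: float, eps: float, k: float
) -> float:
    """Evaluate d/(alpha^{k/(k-1)} * eps) + d/alpha^2.

    Hypothetical helper under the same caveats; k > 1 is assumed here
    only so that the exponent k/(k-1) is well defined.
    """
    if k <= 1:
        raise ValueError("k must exceed 1 for the exponent k/(k-1) to be defined")
    privacy_term = d / (alpha ** (k / (k - 1)) * eps)
    statistical_term = d / alpha ** 2
    return privacy_term + statistical_term


if __name__ == "__main__":
    # Compare the two scalings on an arbitrary example setting.
    d, alpha, eps = 100, 0.1, 1.0
    print(covariance_lower_bound_scaling(d, alpha, eps))
    print(heavy_tailed_mean_lower_bound_scaling(d, alpha, eps, k=2))
```

In the covariance expression, the privacy term $\frac{d^{3/2}}{\alpha \varepsilon}$ exceeds the statistical term $\frac{d}{\alpha^2}$ exactly when $\varepsilon < \alpha \sqrt{d}$, which is one way to see why the cost of privacy grows with the dimension.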
