On Purely Private Covariance Estimation
Tommaso d'Orsi, Gleb Novikov
TL;DR
This work studies covariance estimation under pure differential privacy, combining an additive perturbation with a projection step. The perturbation uses a nuclear-norm based noise model that achieves $ε$-DP. For large datasets ($n≥d^2/ε$) it matches the optimal Frobenius-norm error and attains the best known error in every Schatten $p$-norm; this error is information-theoretically optimal for $p≥2$, which includes the spectral norm. For small datasets ($n<d^2/ε$), projecting the output onto a nuclear-norm ball of appropriate radius yields the optimal Frobenius-norm error $O(\sqrt{d\,\text{Tr}(Σ)/n})$, which depends on the trace of $Σ$ and improves on previously known bounds.
Abstract
We present a simple perturbation mechanism for the release of $d$-dimensional covariance matrices $Σ$ under pure differential privacy. For large datasets with at least $n\geq d^2/\varepsilon$ elements, our mechanism recovers the provably optimal Frobenius norm error guarantees of \cite{nikolov2023private}, while simultaneously achieving the best known error for all other $p$-Schatten norms, with $p\in [1,\infty]$. Our error is information-theoretically optimal for all $p\ge 2$. In particular, our mechanism is the first purely private covariance estimator that achieves optimal error in spectral norm. For small datasets with $n< d^2/\varepsilon$, we further show that by projecting the output onto the nuclear norm ball of appropriate radius, our algorithm achieves the optimal Frobenius norm error $O(\sqrt{d\;\text{Tr}(Σ) /n})$, improving over the known bounds $O(\sqrt{d/n})$ of \cite{nikolov2023private} and ${O}\big(d^{3/4}\sqrt{\text{Tr}(Σ)/n}\big)$ of \cite{dong2022differentially}.
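To make the projection step concrete, the following is a minimal NumPy sketch of a Euclidean (Frobenius-norm) projection onto a nuclear-norm ball. It is an illustration under stated assumptions, not the authors' implementation: the function names `project_l1_ball_nonneg` and `project_nuclear_ball` are hypothetical, the radius is taken as an input because the abstract only says "appropriate radius", and the choice of the Frobenius norm for the projection is an assumption. The sketch uses the standard fact that projecting a matrix onto a nuclear-norm ball amounts to projecting its singular values onto an $\ell_1$ ball. It does not include the private noise-sampling step.

```python
import numpy as np


def project_l1_ball_nonneg(s, radius):
    """Euclidean projection of a nonnegative vector onto {x >= 0, sum(x) <= radius}.

    Uses the standard sort-and-threshold procedure. Assumes radius > 0.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    s = np.asarray(s, dtype=float)
    if s.sum() <= radius:
        return s.copy()  # already inside the ball
    u = np.sort(s)[::-1]                 # entries in decreasing order
    css = np.cumsum(u) - radius          # cumulative excess over the radius
    k = np.arange(1, len(u) + 1)
    cond = u - css / k > 0               # entries that stay positive after shrinkage
    rho = k[cond][-1]
    theta = css[cond][-1] / rho          # soft-threshold level
    return np.maximum(s - theta, 0.0)


def project_nuclear_ball(A, radius):
    """Frobenius-norm projection of a matrix onto the nuclear-norm ball.

    Hypothetical helper: shrinks the singular values of A so that they
    sum to at most `radius`, keeping the singular vectors unchanged.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_proj = project_l1_ball_nonneg(s, radius)
    return (U * s_proj) @ Vt


if __name__ == "__main__":
    # Toy stand-in for a noisy covariance estimate (not real data).
    A = np.diag([3.0, 2.0, 1.0])
    P = project_nuclear_ball(A, radius=3.0)
    print(np.round(P, 6))
    print("nuclear norm after projection:", np.linalg.norm(P, "nuc"))
```

In this toy case the singular values $(3, 2, 1)$ are shrunk by a common threshold of $1$, giving $(2, 1, 0)$, so the projected matrix has nuclear norm equal to the radius.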
