On Purely Private Covariance Estimation

Tommaso d'Orsi, Gleb Novikov

TL;DR

This work presents a simple approach to covariance estimation under pure differential privacy that combines an additive perturbation with a projection step. The perturbation uses a nuclear-norm based noise model that achieves $ε$-DP and yields tight $S_p$-norm error bounds uniformly across Schatten norms, including optimal spectral-norm guarantees in the large-sample regime. Projecting the output onto a nuclear-norm ball further improves the Frobenius error: the resulting bound depends on the trace of $Σ$, is near-optimal in the small-data regime, and improves on prior bounds. Together, the two steps give information-theoretically optimal error for $p≥2$ and a polynomial-time private covariance estimator across regimes, with implications for private data releases in high dimensions.
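The projection step mentioned above can be illustrated with a small sketch. It assumes the projection is the Euclidean (Frobenius) projection of a symmetric matrix onto the nuclear-norm ball, which reduces to projecting the eigenvalues onto an $\ell_1$ ball. The function names `project_l1_ball` and `project_nuclear_ball` and the choice of radius are illustrative assumptions, not taken from the paper, and the noise-addition step is not shown.

```python
import numpy as np


def project_l1_ball(v, radius):
    """Euclidean projection of vector v onto {x : ||x||_1 <= radius}.

    Standard sort-and-threshold method; `radius` must be positive.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    abs_v = np.abs(v)
    if abs_v.sum() <= radius:
        return v.copy()  # already inside the ball
    u = np.sort(abs_v)[::-1]           # magnitudes in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    # largest index whose entry stays positive after thresholding
    rho = np.nonzero(u * ks > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(abs_v - theta, 0.0)


def project_nuclear_ball(M, radius):
    """Frobenius projection of a symmetric matrix onto the nuclear-norm ball.

    Assumption (hypothetical helper): M is symmetric up to rounding, so its
    nuclear norm is the sum of absolute eigenvalues.
    """
    sym = (M + M.T) / 2.0              # symmetrize against rounding error
    eigvals, eigvecs = np.linalg.eigh(sym)
    shrunk = project_l1_ball(eigvals, radius)
    return (eigvecs * shrunk) @ eigvecs.T


if __name__ == "__main__":
    noisy = np.diag([3.0, 1.0])        # stand-in for a perturbed covariance
    projected = project_nuclear_ball(noisy, radius=2.0)
    print(np.round(projected, 6))
```

Working on the eigenvalues keeps the sketch short: the Frobenius norm and the nuclear norm are both unitarily invariant, so the eigenvectors of the input are left unchanged and only the spectrum is shrunk.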

Abstract

We present a simple perturbation mechanism for the release of $d$-dimensional covariance matrices $Σ$ under pure differential privacy. For large datasets with at least $n\geq d^2/\varepsilon$ elements, our mechanism recovers the provably optimal Frobenius norm error guarantees of \cite{nikolov2023private}, while simultaneously achieving the best known error for all other $p$-Schatten norms, with $p\in [1,\infty]$. Our error is information-theoretically optimal for all $p\ge 2$; in particular, our mechanism is the first purely private covariance estimator that achieves optimal error in spectral norm. For small datasets $n< d^2/\varepsilon$, we further show that by projecting the output onto the nuclear norm ball of appropriate radius, our algorithm achieves the optimal Frobenius norm error $O(\sqrt{d\;\text{Tr}(Σ) /n})$, improving over the known bounds of $O(\sqrt{d/n})$ of \cite{nikolov2023private} and ${O}\big(d^{3/4}\sqrt{\text{Tr}(Σ)/n}\big)$ of \cite{dong2022differentially}.
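To see how the three small-dataset bounds relate, it may help to compare them directly, ignoring the constants hidden in $O(\cdot)$. The comparison with $\sqrt{d/n}$ assumes $\text{Tr}(Σ)\le 1$ (for instance, when every data point has Euclidean norm at most 1); this normalization is an assumption made here for illustration and is not stated in the abstract.

```latex
\[
\sqrt{\frac{d\,\mathrm{Tr}(\Sigma)}{n}}
  \;=\; \sqrt{\mathrm{Tr}(\Sigma)}\cdot\sqrt{\frac{d}{n}}
  \;\le\; \sqrt{\frac{d}{n}}
  \qquad\text{(assuming } \mathrm{Tr}(\Sigma)\le 1\text{)},
\]
\[
\sqrt{\frac{d\,\mathrm{Tr}(\Sigma)}{n}}
  \;=\; d^{1/2}\sqrt{\frac{\mathrm{Tr}(\Sigma)}{n}}
  \;=\; d^{-1/4}\cdot d^{3/4}\sqrt{\frac{\mathrm{Tr}(\Sigma)}{n}}
  \;\le\; d^{3/4}\sqrt{\frac{\mathrm{Tr}(\Sigma)}{n}}
  \qquad\text{(since } d\ge 1\text{)}.
\]
```

Under these assumptions the new bound is smaller than the first by a factor of $\sqrt{\text{Tr}(Σ)}$ and smaller than the second by a factor of $d^{1/4}$.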

Paper Structure

This paper contains 11 sections, 6 theorems, 47 equations, 2 algorithms.

Key Result

Theorem 1.1

Let $\varepsilon>0$. Algorithm alg:main-schatten is $\varepsilon$-DP, runs in polynomial time, and with high probability its output $\hat{\Sigma}$ satisfies for every $p\in [1,\infty]$.

Theorems & Definitions (9)

  • Theorem 1.1
  • Theorem 1.2
  • Theorem 4.1
  • proof
  • Lemma 4.2: Nuclear-radial factorization
  • Lemma 4.3
  • proof
  • Theorem 4.4
  • proof