Private Low-Rank Approximation for Covariance Matrices, Dyson Brownian Motion, and Eigenvalue-Gap Bounds for Gaussian Perturbations

Oren Mangoubi, Nisheeth K. Vishnoi

TL;DR

This paper studies the private release of a rank-$k$ approximation to a covariance matrix under $(\varepsilon,\delta)$-differential privacy. It introduces a complex variant of the Gaussian mechanism and analyzes it by treating the Gaussian perturbation as a continuous-time Dyson Brownian motion. From the resulting SDEs for the evolving eigenvalues and eigenvectors, the authors obtain a time-integrated bound on the Frobenius-norm utility that scales like $\tilde{O}(\sqrt{kd})\,\frac{\sigma_k}{\sigma_k-\sigma_{k+1}}\,\frac{\sqrt{\log(1/\delta)}}{\varepsilon}$ when the $k$-th eigengap $\sigma_k-\sigma_{k+1}$ is sizable, improving on previous results in that regime. They also prove high-probability eigenvalue-gap bounds for GUE/GOE perturbations of the form $\mathbb{P}(\eta_i-\eta_{i+1} \le c\,s/\sqrt{d}) \le s^{\beta+1}+d^{-1000}$, where $\beta=2$ in the complex case and $\beta=1$ in the real case, so that gaps between adjacent eigenvalues are unlikely to be much smaller than $1/\sqrt{d}$. The framework extends to private subspace recovery and sheds light on low-rank approximation under average-case perturbations and on eigenvalue gaps of random matrices, with potential broader impact in privacy-preserving data analysis and numerical linear algebra under random noise. Overall, the work combines stochastic calculus, random matrix theory, and differential privacy to yield tighter, structure-exploiting guarantees for private low-rank covariance approximation. Key techniques include tracking the eigenspace dynamically via Dyson Brownian motion and using time-varying rank-$k$ projections to mitigate the ill-conditioning caused by small eigengaps.
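The eigenvalue-gap statement can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's proof technique: it samples GOE ($\beta=1$) or GUE ($\beta=2$) matrices and estimates how often one adjacent gap falls below $s/\sqrt{d}$. The function names, the unit-variance normalization of the entries, and the choice $c=1$ are assumptions made for illustration only.

```python
import numpy as np

def sample_gaussian_ensemble(d, beta, rng):
    """Sample a d x d GOE (beta=1) or GUE (beta=2) matrix.

    Assumed normalization for this sketch: off-diagonal entries have unit
    variance, so typical adjacent eigenvalue gaps are of order 1/sqrt(d).
    """
    if beta == 1:
        g = rng.standard_normal((d, d))
        return (g + g.T) / np.sqrt(2.0)
    if beta == 2:
        g = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
        return (g + g.conj().T) / 2.0
    raise ValueError("beta must be 1 (real) or 2 (complex)")

def small_gap_frequency(d, beta, s, i, trials, rng):
    """Fraction of trials with eta_i - eta_{i+1} <= s / sqrt(d) (i is 0-based)."""
    threshold = s / np.sqrt(d)
    hits = 0
    for _ in range(trials):
        h = sample_gaussian_ensemble(d, beta, rng)
        eigs = np.linalg.eigvalsh(h)[::-1]  # eigvalsh is ascending; flip to descending
        if eigs[i] - eigs[i + 1] <= threshold:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, trials = 50, 500
    for beta in (1, 2):
        for s in (0.1, 0.3, 1.0):
            freq = small_gap_frequency(d, beta, s, d // 2, trials, rng)
            # s**(beta + 1) is the leading term of the stated bound with c = 1 (assumed).
            print(f"beta={beta} s={s}: empirical {freq:.3f}, s^(beta+1) = {s ** (beta + 1):.3f}")
```

The empirical frequencies depend on the constant $c$ and on the normalization, so the printed comparison is only a qualitative check that small gaps become rarer as $s$ shrinks, and rarer still in the complex case.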

Abstract

We consider the problem of approximating a $d \times d$ covariance matrix $M$ with a rank-$k$ matrix under $(\varepsilon,\delta)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and obtain upper bounds on the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$. Our analysis provides improvements over previous bounds, particularly when the spectrum of $M$ satisfies natural structural assumptions. The novel insight is to view the addition of Gaussian noise to a matrix as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations enable us to upper bound the Frobenius distance between the best rank-$k$ approximation of $M$ and that of a Gaussian perturbation of $M$ as an integral that involves inverse eigenvalue gaps of the stochastically evolving matrix, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems. Subsequently, again using the Dyson Brownian motion viewpoint, we show that the eigenvalues of the matrix $M$ perturbed by Gaussian noise have large gaps with high probability. These results also contribute to the analysis of low-rank approximations under average-case perturbations, and to an understanding of eigenvalue gaps for random matrices, both of which may be of independent interest.
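The quantity bounded in the abstract can be illustrated with a small experiment. The sketch below adds Hermitian complex Gaussian noise to a PSD matrix $M$, keeps the top-$k$ eigenpairs of the perturbed matrix, and measures the Frobenius distance to $M_k$. It is an illustration under stated assumptions, not the paper's algorithm: `noise_std` is a hypothetical free parameter that is not calibrated to any $(\varepsilon,\delta)$ guarantee, and all function names are invented for this example.

```python
import numpy as np

def random_psd_with_spectrum(spectrum, rng):
    """Real symmetric PSD matrix with the given eigenvalues and random eigenvectors."""
    d = len(spectrum)
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
    return (q * np.asarray(spectrum)) @ q.T

def top_k_eig_approx(a, k):
    """Keep the k largest eigenvalues of a Hermitian matrix and their eigenvectors.

    For a PSD matrix this is the Frobenius-norm-minimizing rank-k approximation.
    """
    vals, vecs = np.linalg.eigh(a)      # eigenvalues in ascending order
    top_vals, top_vecs = vals[-k:], vecs[:, -k:]
    return (top_vecs * top_vals) @ top_vecs.conj().T

def complex_gaussian_perturbation(m, noise_std, rng):
    """Add Hermitian complex Gaussian noise; noise_std is NOT a DP calibration."""
    d = m.shape[0]
    g = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return m + noise_std * (g + g.conj().T)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = 3
    # Spectrum with a large gap between sigma_k and sigma_{k+1} (illustrative values).
    m = random_psd_with_spectrum([10.0, 9.0, 8.0, 1.0, 0.5, 0.1], rng)
    m_k = top_k_eig_approx(m, k)
    for noise_std in (0.01, 0.1, 1.0):
        noisy = complex_gaussian_perturbation(m, noise_std, rng)
        err = np.linalg.norm(top_k_eig_approx(noisy, k) - m_k)  # Frobenius norm
        print(f"noise_std={noise_std}: Frobenius error {err:.4f}")
```

Varying the spectrum passed to `random_psd_with_spectrum` gives a rough feel for how the error reacts when the gap $\sigma_k-\sigma_{k+1}$ shrinks, which is the regime the integral bound is designed to handle.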

Paper Structure

This paper contains 80 sections, 37 theorems, 300 equations, 3 figures, 1 algorithm.

Key Result

Theorem 2.2

Given $\varepsilon,\delta>0$, there is an $(\varepsilon, \delta)$-differentially private algorithm (Algorithm alg_quaternion_Gaussian) that, on input $k>0$ and a real symmetric PSD matrix $M \in \mathbb{R}^{d \times d}$ with eigenvalues $\sigma_1 \geq \cdots \geq \sigma_d \geq 0$ satisfying Assumption […] $M_k$ is the Frobenius-norm-minimizing rank-$k$ approximation to $M$.

Figures (3)

  • Figure 1: One run of a simulation of the eigenvalues $\gamma_1(t) \geq \cdots \geq \gamma_d(t)$ of Dyson Brownian motion, in the real case (left) and the complex case (right), with initial condition $\gamma_1(0) = \cdots = \gamma_d(0) = 0$, for $d=6$. In the complex case, eigenvalue repulsion is stronger and the gaps between the eigenvalues are not as small as in the real case.
  • Figure 2: A diagram showing the structure of the proof of Theorems 2.2 and 2.3. Lower-level lemmas and propositions are denoted by subdued dashed boxes. (See Figure 3 for a diagram of the structure of the proof of Theorem 2.4.)
  • Figure 3: A diagram of the proof of Theorem 2.4. Lemmas and propositions below Lemma \ref{lemma_GUE_gaps} are separated into results dealing with the "bulk" of the eigenvalue spectrum, and analogous (but slightly simpler to prove) results dealing with the "edge" of the eigenvalue spectrum (denoted by blue boxes). Throughout the diagram, lower-level propositions are denoted by subdued dashed boxes.
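
A simulation like the one described for Figure 1 can be approximated as follows. This is a sketch under assumptions, not the authors' code: it accumulates symmetric (real) or Hermitian (complex) Gaussian increments starting from the zero matrix and diagonalizes at each step, rather than discretizing the eigenvalue SDEs directly. The step count, time horizon, and function name are illustrative choices.

```python
import numpy as np

def simulate_dyson_eigenvalues(d, beta, t_final, n_steps, rng):
    """Eigenvalue paths of a matrix Brownian motion started at the zero matrix.

    Returns an array of shape (n_steps + 1, d); each row holds the eigenvalues
    gamma_1(t) >= ... >= gamma_d(t) at one time step. beta=1 uses real
    symmetric increments, beta=2 uses complex Hermitian increments.
    """
    if beta not in (1, 2):
        raise ValueError("beta must be 1 (real) or 2 (complex)")
    dt = t_final / n_steps
    x = np.zeros((d, d), dtype=complex if beta == 2 else float)
    path = np.zeros((n_steps + 1, d))  # row 0 is the initial condition, all zeros
    for step in range(1, n_steps + 1):
        if beta == 1:
            g = rng.standard_normal((d, d))
            increment = (g + g.T) / np.sqrt(2.0)
        else:
            g = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
            increment = (g + g.conj().T) / 2.0
        x = x + np.sqrt(dt) * increment
        path[step] = np.linalg.eigvalsh(x)[::-1]  # ascending -> descending
    return path

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for beta, label in ((1, "real"), (2, "complex")):
        path = simulate_dyson_eigenvalues(d=6, beta=beta, t_final=1.0, n_steps=1000, rng=rng)
        # Smallest adjacent gap over all positive times in this single run.
        min_gap = np.min(path[1:, :-1] - path[1:, 1:])
        print(f"{label} case: smallest adjacent gap along the path = {min_gap:.5f}")
```

Diagonalizing the matrix process sidesteps the singular drift term that appears when the eigenvalue equations are discretized directly. A single run is noisy, so comparing the real and complex cases meaningfully would require averaging over many seeds.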

Theorems & Definitions (68)

  • Theorem 2.2: Private low-rank covariance approximation
  • Theorem 2.3: Frobenius bound for complex Gaussian perturbations
  • Theorem 2.4: Eigenvalue gaps of Gaussian Unitary Ensemble (GUE) and Gaussian Orthogonal Ensemble (GOE)
  • Definition 3.1: Itô Integral
  • Lemma 3.1: Itô's Lemma, integral form with no drift; Theorem 3.7.1 of lawler2010stochastic
  • Definition 3.2: Strong solution to SDE; Definition 5.3.1 in karatzas1991brownian
  • Lemma 3.2: Existence and uniqueness of solutions to Dyson Brownian motion
  • Lemma 3.3: Continuity w.r.t. initial condition; Proposition 4.3.5 in anderson2010introduction
  • Lemma 3.4: Non-collision of Dyson Brownian motion for $\beta \geq 1$
  • Lemma 3.5: Theorem 4.4.5 of vershynin2018high, special case
  • ...and 58 more