Near-Optimal differentially private low-rank trace regression with guaranteed private initialization

Mengyue Zha

TL;DR

The paper develops near-optimal differentially private methods for estimating a low-rank matrix under trace regression with Gaussian design. It introduces a DP-initialization procedure with provable privacy and utility guarantees and a DP-RGrad algorithm that achieves near-optimal convergence rates starting from a good initialization. A DP-Fano based minimax lower bound shows fundamental limits under standard DP, while the DP-RGrad upper bound matches these limits up to logarithmic factors with an extra $\sigma_r$-dependent cost; a weaker notion of DP is shown to recover the optimal rate. The work advances understanding of initialization, sensitivity analysis via spectral representations, and the trade-offs between privacy, sample size, and estimation accuracy in DP low-rank trace regression.
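As a minimal sketch of the setting, assume measurement matrices $X_i$ with i.i.d. $N(0,1)$ entries (identity covariance, a simplification of a general Gaussian design) and Gaussian noise. The trace regression model observes $y_i = \langle X_i, M\rangle + \xi_i$. Under this design $\mathbb{E}[y_i X_i] = M$, so a non-private spectral initialization can take the top-$r$ SVD of $\frac{1}{n}\sum_i y_i X_i$. The function names, dimensions, and noise level below are illustrative assumptions, not taken from the paper, and no privacy noise is added.

```python
import numpy as np

def simulate_trace_regression(n, d1, d2, r, noise_std=0.1, seed=0):
    """Draw (X_i, y_i) from y_i = <X_i, M> + xi_i with a random rank-r M.

    Assumption: X_i has i.i.d. N(0, 1) entries and xi_i ~ N(0, noise_std^2).
    """
    rng = np.random.default_rng(seed)
    # Rank-r ground truth M = U diag(s) V^T with orthonormal U, V.
    U, _ = np.linalg.qr(rng.standard_normal((d1, r)))
    V, _ = np.linalg.qr(rng.standard_normal((d2, r)))
    s = np.linspace(2.0, 1.0, r)  # illustrative singular values
    M = (U * s) @ V.T
    X = rng.standard_normal((n, d1, d2))
    # <X_i, M> = trace(X_i^T M), computed for all i at once.
    y = np.einsum("nij,ij->n", X, M) + noise_std * rng.standard_normal(n)
    return X, y, M

def spectral_init(X, y, r):
    """Non-private spectral initialization: top-r SVD of (1/n) sum_i y_i X_i."""
    n = X.shape[0]
    L = np.einsum("n,nij->ij", y, X) / n
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

X, y, M = simulate_trace_regression(n=4000, d1=8, d2=6, r=2)
M0 = spectral_init(X, y, r=2)
rel_err = np.linalg.norm(M0 - M) / np.linalg.norm(M)
print(f"relative Frobenius error of the spectral initialization: {rel_err:.3f}")
```

A private version would have to perturb this estimate with noise calibrated to its sensitivity, which is the quantity the paper analyzes. That calibration is deliberately left out of this sketch.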

Abstract

We study differentially private (DP) estimation of a rank-$r$ matrix $M \in \mathbb{R}^{d_1\times d_2}$ under the trace regression model with Gaussian measurement matrices. Theoretically, the sensitivity of non-private spectral initialization is precisely characterized, and the differential-privacy-constrained minimax lower bound for estimating $M$ under the Schatten-$q$ norm is established. Methodologically, the paper introduces a computationally efficient algorithm for DP-initialization with a sample size of $n \geq \widetilde O (r^2 (d_1\vee d_2))$. Under certain regularity conditions, the DP-initialization falls within a local ball surrounding $M$. We also propose a differentially private algorithm for estimating $M$ based on Riemannian optimization (DP-RGrad), which achieves a near-optimal convergence rate with the DP-initialization and sample size of $n \geq \widetilde O(r (d_1 + d_2))$. Finally, the paper discusses the non-trivial gap between the minimax lower bound and the upper bound of low-rank matrix estimation under the trace regression model. It is shown that the estimator given by DP-RGrad attains the optimal convergence rate in a weaker notion of differential privacy. Our powerful technique for analyzing the sensitivity of initialization requires no eigengap condition between $r$ non-zero singular values.
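The Riemannian-optimization idea behind DP-RGrad can be illustrated with a generic noisy Riemannian gradient step on the rank-$r$ manifold: compute the least-squares gradient, perturb it with Gaussian noise, project onto the tangent space at the current iterate, and retract with a rank-$r$ SVD. This is a sketch under stated assumptions, not the paper's algorithm. The step size, the noise scale `noise_std`, the point at which noise is injected, and all function names are hypothetical, and the noise here is not calibrated to any $(\varepsilon, \delta)$ budget.

```python
import numpy as np

def top_r_svd(A, r):
    """Return the leading r singular triplets (U, s, V) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r].T

def tangent_project(G, U, V):
    """Project G onto the tangent space of the rank-r manifold at U diag(s) V^T."""
    UtG = U.T @ G
    GV = G @ V
    return U @ UtG + GV @ V.T - U @ (UtG @ V) @ V.T

def noisy_rgrad_step(M, X, y, r, step, noise_std, rng):
    """One noisy Riemannian gradient step for the least-squares loss.

    Assumption: noise_std is a free parameter; it carries no privacy guarantee.
    """
    n = X.shape[0]
    U, _, V = top_r_svd(M, r)
    residual = np.einsum("nij,ij->n", X, M) - y
    grad = np.einsum("n,nij->ij", residual, X) / n
    # Noise is added before projection, so only its tangent part enters the update.
    noise = noise_std * rng.standard_normal(M.shape)
    direction = tangent_project(grad + noise, U, V)
    # Retraction: truncate the updated matrix back to rank r.
    U2, s2, V2 = top_r_svd(M - step * direction, r)
    return (U2 * s2) @ V2.T

# Illustrative data: i.i.d. N(0, 1) design and a random rank-2 target.
rng = np.random.default_rng(0)
n, d1, d2, r = 4000, 8, 6, 2
U0, _ = np.linalg.qr(rng.standard_normal((d1, r)))
V0, _ = np.linalg.qr(rng.standard_normal((d2, r)))
M_true = (U0 * np.array([2.0, 1.0])) @ V0.T
X = rng.standard_normal((n, d1, d2))
y = np.einsum("nij,ij->n", X, M_true) + 0.1 * rng.standard_normal(n)

# Start inside a small ball around M_true, standing in for a good initialization.
Ui, si, Vi = top_r_svd(M_true + 0.03 * rng.standard_normal((d1, d2)), r)
M_hat = (Ui * si) @ Vi.T
err_init = np.linalg.norm(M_hat - M_true)
for _ in range(50):
    M_hat = noisy_rgrad_step(M_hat, X, y, r, step=0.5, noise_std=0.005, rng=rng)
err_final = np.linalg.norm(M_hat - M_true)
print(f"Frobenius error: initial {err_init:.3f}, after 50 steps {err_final:.3f}")
```

The starting point is deliberately placed close to the target, mirroring the abstract's requirement that the initialization fall within a local ball around $M$. With a larger `noise_std` the iterates settle at a correspondingly higher error floor, which is the privacy and accuracy trade-off in miniature.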

Paper Structure

This paper contains 41 sections, 24 theorems, 164 equations, 2 algorithms.

Key Result

Lemma 1

Under the Assumption assumption_for_X, for any $B\in \mathbb{R}^{d_1 \times d_2}$ of rank $r$, there exist constants $c_1, c_2, c_3>0$ and $c_5>c_4>0$ such that if $n \geq c_1 r(d_1 + d_2)$, with probability at least $1-c_2 \exp \left(-c_3 r(d_1+d_2)\right)$, we have $c_4 \sqrt{C_u C_l}\|B\|_{\mathr

Theorems & Definitions (29)

  • Lemma 1
  • Lemma 2: Spectral representation formula
  • Theorem 1: Privacy and utility guarantees of the initialization $\widetilde{M}_0$
  • Corollary 1
  • Lemma 3: DP-Fano's lemma, acharya2021differentially
  • Theorem 2
  • Theorem 3
  • Definition 1: weak $(\varepsilon, \delta)$-differential privacy
  • Theorem 4: Bernstein's inequality, koltchinskii2011neumann
  • Lemma 4: Matrix Permutation, shen2023computationally
  • ...and 19 more