Near-Optimal differentially private low-rank trace regression with guaranteed private initialization

Mengyue Zha

TL;DR

The paper develops near-optimal differentially private methods for estimating a low-rank matrix under trace regression with Gaussian design. It introduces a DP-initialization procedure with provable privacy and utility guarantees and a DP-RGrad algorithm that achieves near-optimal convergence rates starting from a good initialization. A DP-Fano based minimax lower bound shows fundamental limits under standard DP, while the DP-RGrad upper bound matches these limits up to logarithmic factors with an extra $\sigma_r$-dependent cost; a weaker notion of DP is shown to recover the optimal rate. The work advances understanding of initialization, sensitivity analysis via spectral representations, and the trade-offs between privacy, sample size, and estimation accuracy in DP low-rank trace regression.
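To make the setting concrete, the following is a minimal, non-private sketch of the trace regression model $y_i = \langle X_i, M\rangle + \xi_i$ and of a plain spectral initialization. It is an illustration under simplifying assumptions, not the paper's DP-initialization: the measurement matrices are assumed to have i.i.d. standard Gaussian entries (so that $\frac{1}{n}\sum_i y_i X_i$ is unbiased for $M$), and the function names, dimensions, and noise level are hypothetical choices made for this example.

```python
import numpy as np


def simulate_trace_regression(n, d1, d2, r, noise_std=0.1, seed=0):
    """Draw (X_i, y_i) from y_i = <X_i, M> + noise with a random rank-r M.

    Assumption for illustration: X_i has i.i.d. N(0, 1) entries.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((d1, r))
    V = rng.standard_normal((d2, r))
    M = U @ V.T  # rank r with probability one
    X = rng.standard_normal((n, d1, d2))
    # <X_i, M> = trace(X_i^T M) = sum of entrywise products
    y = np.einsum("nij,ij->n", X, M) + noise_std * rng.standard_normal(n)
    return X, y, M


def spectral_init(X, y, r):
    """Non-private spectral initialization: best rank-r approximation of
    (1/n) * sum_i y_i X_i, computed via a truncated SVD."""
    M_hat = np.einsum("n,nij->ij", y, X) / len(y)
    U, s, Vt = np.linalg.svd(M_hat, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


X, y, M = simulate_trace_regression(n=4000, d1=10, d2=8, r=2)
M0 = spectral_init(X, y, r=2)
rel_err = np.linalg.norm(M0 - M) / np.linalg.norm(M)
print(f"relative Frobenius error of the spectral initialization: {rel_err:.3f}")
```

A private variant has to control how much `M0` can change when a single pair $(X_i, y_i)$ is replaced; that sensitivity is what the paper's analysis characterizes.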

Abstract

We study differentially private (DP) estimation of a rank-$r$ matrix $M \in \mathbb{R}^{d_1\times d_2}$ under the trace regression model with Gaussian measurement matrices. Theoretically, we precisely characterize the sensitivity of non-private spectral initialization and establish the differential-privacy-constrained minimax lower bound for estimating $M$ under the Schatten-$q$ norm. Methodologically, we introduce a computationally efficient algorithm for DP-initialization that requires a sample size of $n \geq \widetilde O (r^2 (d_1\vee d_2))$. Under certain regularity conditions, the DP-initialization falls within a local ball surrounding $M$. We also propose a differentially private algorithm for estimating $M$ based on Riemannian optimization (DP-RGrad), which achieves a near-optimal convergence rate given the DP-initialization and a sample size of $n \geq \widetilde O(r (d_1 + d_2))$. Finally, we discuss the non-trivial gap between the minimax lower bound and the upper bound for low-rank matrix estimation under the trace regression model, and show that the estimator given by DP-RGrad attains the optimal convergence rate under a weaker notion of differential privacy. The technique used to analyze the sensitivity of the initialization requires no eigengap condition among the $r$ non-zero singular values.
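The role of sensitivity can be illustrated with the classical Gaussian mechanism, which adds noise scaled to the $\ell_2$ (Frobenius) sensitivity of the released quantity. The sketch below uses the textbook calibration $\sigma = \Delta\sqrt{2\log(1.25/\delta)}/\varepsilon$, valid for $0 < \varepsilon < 1$. The function name and the sensitivity value in the usage lines are hypothetical, and the paper's own noise calibration may differ.

```python
import numpy as np


def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng=None):
    """Release `value` with i.i.d. Gaussian noise on every entry.

    Uses the classical calibration sigma = Delta * sqrt(2 log(1.25/delta)) / eps,
    which gives (eps, delta)-DP when 0 < eps < 1 and Delta bounds the
    l2 (Frobenius) sensitivity of `value`.
    """
    if not 0 < epsilon < 1:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    value = np.asarray(value, dtype=float)
    noise = rng.normal(loc=0.0, scale=sigma, size=value.shape)
    return value + noise, sigma


# Hypothetical usage: privatize a 3 x 4 matrix whose sensitivity is assumed to be 0.01.
released, sigma = gaussian_mechanism(
    np.zeros((3, 4)), l2_sensitivity=0.01, epsilon=0.5, delta=1e-5,
    rng=np.random.default_rng(0),
)
```

The noise standard deviation grows linearly in the sensitivity, so a tighter sensitivity bound translates directly into less noise for the same privacy budget.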

Paper Structure (41 sections, 24 theorems, 164 equations, 2 algorithms)

Introduction
Trace regression model
Sensitivity
Differential privacy
Gaussian mechanism
RIP of Gaussian measurement matrices
Notations
Main results
Motivations and related works
Organization
DP-initialization
Algorithm for DP-initialization
Spectral representation formula
Symmetric dilation and auxiliary operators
Privacy and utility guarantees of the initialization
...and 26 more sections

Key Result

Lemma 1

Under the Assumption assumption_for_X, for any $B\in \mathbb{R}^{d_1 \times d_2}$ of rank $r$, there exist constants $c_1, c_2, c_3>0$ and $c_5>c_4>0$ such that if $n \geq c_1 r(d_1 + d_2)$, with probability at least $1-c_2 \exp \left(-c_3 r(d_1+d_2)\right)$, we have $c_4 \sqrt{C_u C_l}\|B\|_{\mathr

Theorems & Definitions (29)

Lemma 1
Lemma 2: Spectral representation formula
Theorem 1: Privacy and utility guarantees of the initialization $\widetilde{M}_0$
Corollary 1
Lemma 3: DP-Fano's lemma (acharya2021differentially)
Theorem 2
Theorem 3
Definition 1: weak $(\varepsilon, \delta)$-differential privacy
Theorem 4: Bernstein's inequality (koltchinskii2011neumann)
Lemma 4: Matrix Permutation (shen2023computationally)
...and 19 more
