Beyond Covariance Matrix: The Statistical Complexity of Private Linear Regression

Fan Chen, Jiachun Li, Alexander Rakhlin, David Simchi-Levi

TL;DR

This work develops a minimax theory for private linear regression under general covariate distributions and reveals that privacy complexity is governed by an $L_1$-analogue of the Fisher information, not the usual covariance. It introduces Information-Weighted Regression, computable in both Local and Global DP, and proves $L_1$ convergence and distribution-specific minimax optimality. The framework extends to dimension-free settings with near-minimax performance, and is applied to private linear contextual bandits to achieve rate-optimal regret under both joint and local privacy, addressing open questions about privacy-utility trade-offs. Overall, the approach unifies privacy-aware complexity via the matrices $\mathbf{U}_{\lambda}$ and $\mathbf{W}_{\gamma,\lambda}$, yielding practical, near-optimal private-learning and decision-making tools across DP settings.

Abstract

We study the statistical complexity of private linear regression under an unknown, potentially ill-conditioned covariate distribution. Somewhat surprisingly, under privacy constraints the intrinsic complexity is not captured by the usual covariance matrix but rather by its $L_1$ analogues. Building on this insight, we establish minimax convergence rates for both the central and local privacy models and introduce an Information-Weighted Regression method that attains the optimal rates. As an application, in private linear contextual bandits, we propose an efficient algorithm that achieves rate-optimal regret bounds of order $\sqrt{T}+\frac{1}{\alpha}$ and $\sqrt{T}/\alpha$ under joint and local $\alpha$-privacy models, respectively. Notably, our results demonstrate that joint privacy comes at almost no additional cost, addressing the open problems posed by Azize and Basu (2024).
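
To make the gap between the two regret orders concrete, the following minimal Python sketch evaluates $\sqrt{T}+\frac{1}{\alpha}$ and $\sqrt{T}/\alpha$ for given $T$ and $\alpha$. The function name `regret_orders` is hypothetical, and all constants, dimension dependence, and logarithmic factors are dropped, so the numbers only indicate scaling and are not the bounds proved in the paper.

```python
import math


def regret_orders(T, alpha):
    """Return (joint, local) regret orders from the abstract.

    Assumption: constants, dimension factors, and log factors are all
    set aside, so these values illustrate scaling only.
    """
    joint = math.sqrt(T) + 1.0 / alpha  # order sqrt(T) + 1/alpha (joint privacy)
    local = math.sqrt(T) / alpha        # order sqrt(T) / alpha   (local privacy)
    return joint, local


if __name__ == "__main__":
    for T in (10**2, 10**4, 10**6):
        joint, local = regret_orders(T, alpha=0.1)
        print(f"T={T:>8}  joint~{joint:.1f}  local~{local:.1f}")
```

Under joint privacy the $\frac{1}{\alpha}$ term is additive and becomes negligible once $\sqrt{T}\gg\frac{1}{\alpha}$, whereas under local privacy the factor $\frac{1}{\alpha}$ multiplies $\sqrt{T}$ for every $T$.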

Paper Structure

This paper contains 146 sections, 68 theorems, 434 equations, 1 figure, 2 tables, 17 algorithms.

Key Result

Lemma 3.1

Consider 1-dimensional linear models with covariate distribution $p$ (supported on $[-1,1]$). Suppose that $\alpha\in(0,1)$, $\beta\in[0,\frac{1}{T}]$. Then it holds that where $C_0, C_1>0$ are absolute constants.

Figures (1)

  • Figure 1: Illustration of the behavior of the information matrix $\mathbf{W}_{\gamma,\lambda}$, where $\gamma\asymp \frac{1}{\alpha\sqrt{T}}$ and $\lambda\asymp \frac{1}{\sqrt{T}}$ as in Eq. (\ref{eq:JDP-minimax-demo}), under different scalings of $(\alpha,T)$. In the regime $T\gg \frac{1}{\alpha^2}$ ($\gamma\ll1$, left), $\mathbf{W}_{\gamma,\lambda}$ scales as $(\boldsymbol{\Sigma}+\lambda^2\mathbf{I})^{-1/2}$ (Eq. (\ref{eq:W-to-cov})), and hence the optimal DP estimators (e.g., the information-weighted regression estimator) achieve the non-private optimal rate (Eq. (\ref{eq:Fisher-demo})), i.e., privacy is "for free". On the other hand, in the "high privacy" regime $T\leq \frac{1}{\alpha^2}$ ($\gamma\geq 1$, right), the "cost of privacy" dominates the convergence rate, as $\mathbf{W}_{\gamma,\lambda}$ scales as $\gamma \mathbf{U}_{\lambda\gamma}$, and the minimax-optimal DP rate reduces to Eq. (\ref{eq:JDP-minimax-high-privacy}). Therefore, the definition (\ref{def:W-demo}) of $\mathbf{W}_{\gamma,\lambda}$ can be interpreted as an interpolation between the covariance matrix $\boldsymbol{\Sigma}$ and the LDP information matrix $\mathbf{U}_{\lambda}$.
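
The two regimes in the caption can be illustrated with a short Python sketch that computes $\gamma$ and $\lambda$ from $(\alpha, T)$ and reports which side of the threshold $T = \frac{1}{\alpha^2}$ applies. The function name `privacy_regime` and the constants `c_gamma` and `c_lambda` are hypothetical: the caption fixes $\gamma$ and $\lambda$ only up to constant factors ($\asymp$), and both constants are assumed to be 1 here.

```python
import math


def privacy_regime(T, alpha, c_gamma=1.0, c_lambda=1.0):
    """Compute gamma ~ 1/(alpha*sqrt(T)) and lambda ~ 1/sqrt(T).

    Assumption: the unspecified constants hidden by the asymptotic
    notation are taken to be c_gamma = c_lambda = 1.
    """
    gamma = c_gamma / (alpha * math.sqrt(T))
    lam = c_lambda / math.sqrt(T)
    # With c_gamma = 1, gamma >= 1 is equivalent to T <= 1/alpha^2.
    regime = "high privacy" if gamma >= 1 else "low privacy"
    return gamma, lam, regime


if __name__ == "__main__":
    for T in (25, 100, 10**6):
        gamma, lam, regime = privacy_regime(T, alpha=0.1)
        print(f"T={T:>8}  gamma={gamma:.3f}  lambda={lam:.4f}  -> {regime}")
```

The "privacy is for free" behavior in the caption corresponds to $\gamma\ll 1$, which is stronger than the $\gamma < 1$ cutoff used in this sketch; values of $\gamma$ just below 1 lie between the two regimes shown in the figure.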

Theorems & Definitions (83)

  • Definition 1: Linear model
  • Definition 2: DP channel
  • Definition 3: Gaussian channel
  • Definition 4: DP algorithms
  • Definition 5: LDP algorithms
  • Lemma 3.1
  • Proposition 3.2: Sub-optimality of smooth algorithms
  • Lemma 3.3: Sub-optimality of SSP; Central model
  • Example 1: Simple distributions
  • Proposition 3.4: Lower bounds under simple distributions
  • ...and 73 more