Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation

Gavin Brown; Krishnamurthy Dvijotham; Georgina Evans; Daogao Liu; Adam Smith; Abhradeep Thakurta

TL;DR

This paper studies private linear regression under squared loss and develops a refined analysis of differentially private gradient descent (DP-GD), that is, gradient descent with clipped per-example gradients and added Gaussian noise. By characterizing the DP-GD iterates as a Gaussian process around the empirical minimizer, it shows that $n=\widetilde{\Omega}(p)$ samples suffice under Gaussian design, so the sample complexity is linear in the dimension up to logarithmic factors. The same characterization yields instance-specific uncertainty estimates in the form of finite-sample confidence intervals. The authors provide formal guarantees for accuracy and for confidence-interval coverage, describe practical methods to construct per-coordinate intervals from independent runs, checkpoints, or all iterates, and validate these results with synthetic experiments. The resulting uncertainty estimates adapt automatically to the data set and incur no extra privacy cost.
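The independent-runs construction mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the final iterates of `m` independent runs are i.i.d. Gaussian and centered at the empirical minimizer, and it builds a standard Student-t interval per coordinate. The function name `per_coordinate_intervals` and the parameter `t_crit` are hypothetical; the caller supplies the Student-t quantile with `m - 1` degrees of freedom (about 2.262 for `m = 10` at the 95% level).

```python
import math
import statistics


def per_coordinate_intervals(runs, t_crit):
    """Per-coordinate t-intervals from independent final iterates.

    runs   : list of m parameter vectors (lists of floats), one per run.
    t_crit : Student-t quantile with m - 1 degrees of freedom,
             supplied by the caller (assumption of this sketch).
    """
    m = len(runs)
    if m < 2:
        raise ValueError("need at least two runs to estimate a variance")
    intervals = []
    # zip(*runs) groups the j-th coordinate of every run together.
    for coord in zip(*runs):
        center = statistics.fmean(coord)
        # statistics.stdev uses the unbiased (m - 1) normalization.
        half_width = t_crit * statistics.stdev(coord) / math.sqrt(m)
        intervals.append((center - half_width, center + half_width))
    return intervals
```

Because the interval width comes from the spread of the runs themselves, it widens or narrows with the variance the algorithm exhibits on the particular data set. Each of the `m` runs consumes privacy budget, so the total budget must be split across them.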

Abstract

We provide an improved analysis of standard differentially private gradient descent for linear regression under the squared error loss. Under modest assumptions on the input, we characterize the distribution of the iterate at each time step. Our analysis leads to new results on the algorithm's accuracy: for a proper fixed choice of hyperparameters, the sample complexity depends only linearly on the dimension of the data. This matches the dimension-dependence of the (non-private) ordinary least squares estimator as well as that of recent private algorithms that rely on sophisticated adaptive gradient-clipping schemes (Varshney et al., 2022; Liu et al., 2023). Our analysis of the iterates' distribution also allows us to construct confidence intervals for the empirical optimizer which adapt automatically to the variance of the algorithm on a particular data set. We validate our theorems through experiments on synthetic data.
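For orientation, a minimal sketch of noisy clipped gradient descent for squared loss is given below. It is an illustration under stated assumptions rather than the paper's exact algorithm: the initialization at zero, the placement of the noise inside the step, and the names `clip`, `dp_gd`, and `noise_std` are choices made here. The noise standard deviation is taken as given; calibrating it to a privacy budget is outside this sketch.

```python
import math
import random


def clip(vec, gamma):
    """Scale vec down so its Euclidean norm is at most gamma."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, gamma / norm) if norm > 0 else 1.0
    return [scale * v for v in vec]


def dp_gd(X, y, gamma, eta, T, noise_std, rng):
    """Run T steps of clipped, noisy gradient descent; return all iterates.

    X, y      : data as lists (n rows of p features, n labels).
    gamma     : per-example gradient clipping threshold.
    eta       : step size.
    noise_std : std of the Gaussian noise added to each coordinate
                of the averaged clipped gradient (assumed given).
    rng       : a random.Random instance.
    """
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    iterates = [theta]
    for _ in range(T):
        avg = [0.0] * p
        for x_i, y_i in zip(X, y):
            # Gradient of 0.5 * (x.theta - y)^2 with respect to theta.
            residual = sum(a * b for a, b in zip(x_i, theta)) - y_i
            g = clip([residual * a for a in x_i], gamma)
            avg = [s + g_j / n for s, g_j in zip(avg, g)]
        theta = [t - eta * (a + rng.gauss(0.0, noise_std))
                 for t, a in zip(theta, avg)]
        iterates.append(theta)
    return iterates


if __name__ == "__main__":
    rng = random.Random(0)
    theta_star = [1.0, -2.0, 0.5]  # hypothetical ground truth
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
    y = [sum(a * b for a, b in zip(x, theta_star)) + rng.gauss(0, 0.1)
         for x in X]
    print(dp_gd(X, y, gamma=5.0, eta=0.5, T=20, noise_std=0.01, rng=rng)[-1])
```

Returning every iterate, not just the last, keeps the checkpoints available for uncertainty estimates built from the trajectory.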

Paper Structure (46 sections, 15 theorems, 65 equations, 11 figures, 1 algorithm)

Introduction
Our Results
Formal Guarantees for Accuracy
Formal Guarantees for Confidence Intervals
Experiments
Techniques
Confidence Intervals
Limitations and Future Work
Related Work
Private Linear Regression
Sufficient Statistics
Private Confidence Intervals
Private Gradient Descent for Regression
Notation
Algorithm
...and 31 more sections

Key Result

Theorem 1.2

Assume we are in the generative setting (Definition 1.1). Assume $n=\widetilde{\Omega}( p )$. Set clipping threshold $\gamma = \widetilde{\Theta}(\sigma \sqrt{p})$, step size $\eta = O(1)$, and number of steps $T = \widetilde{O}(1)$. With high probability the final iterate $\theta
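As a rough worked calculation of what these hyperparameter choices imply for the injected noise, assume privacy is accounted in $\rho$-zCDP under replace-one adjacency (an assumption made here; the excerpt above does not state the convention) and write $\lambda$ for the per-coordinate noise standard deviation (a symbol introduced only for this illustration). Replacing one example changes the average of $n$ clipped gradients by at most $\Delta = 2\gamma/n$ in Euclidean norm. A Gaussian mechanism with sensitivity $\Delta$ and noise scale $\lambda$ satisfies $\Delta^2/(2\lambda^2)$-zCDP, and zCDP parameters add under composition, so over $T$ steps:

$$
T \cdot \frac{\Delta^2}{2\lambda^2} = \rho
\quad\Longrightarrow\quad
\lambda = \frac{\gamma}{n}\sqrt{\frac{2T}{\rho}} .
$$

Substituting $\gamma = \widetilde{\Theta}(\sigma\sqrt{p})$ and $T = \widetilde{O}(1)$ gives

$$
\lambda = \widetilde{O}\!\left(\frac{\sigma\sqrt{p}}{n\sqrt{\rho}}\right),
\qquad
\lambda\sqrt{p} = \widetilde{O}\!\left(\frac{\sigma\, p}{n\sqrt{\rho}}\right),
$$

where $\lambda\sqrt{p}$ is the typical norm of a $p$-dimensional noise vector. Under these assumptions the per-step noise becomes small once $n$ exceeds $p$ by a sufficient factor, which is consistent with a sample complexity linear in the dimension.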

Figures (11)

Figure 1: We fix the ratio $p/n$ and let $p$ grow, with $\rho = 0.05$. Run on data from a well-specified linear model, both DP-GD and OLS have constant error. We compare with AdaSSP (Wang, 2018), a popular algorithm that requires $n=\Omega(p^{3/2})$ examples.
Figure 2: The "cost of privacy": fixing the dimension and allowing the sample size to grow, we see how the error due to sampling dominates the error from privacy. Each point is averaged over 100 independent trials.
Figure 3: We see the fraction of gradients clipped over a grid on dimension and clipping threshold. As the theory predicts, we see low clipping with $\gamma=\Omega(\sqrt{p})$.
Figure 4: We plot the error, (squared) bias, and variance of DP-GD as we change the clipping threshold. The numeric labels give the percentage of all gradients clipped. Low thresholds cause high clipping and bias, while high thresholds have little clipping but high variance.
Figure 5: Average empirical coverage across coordinates over 100 algorithm runs. Error bars reflect the $95$-percentiles of coverage across coordinates.
...and 6 more figures

Theorems & Definitions (37)

Definition 1.1: Generative Setting
Theorem 1.2: Informal
Lemma 2.1
Lemma 2.2: Coupling DP-GD without Clipping
Lemma 2.3
Lemma 2.5: No Clipping Occurs
Lemma 2.6
Theorem 2.7: Main Accuracy Claim
Theorem 3.1: Coverage
Definition 1.1: Approximate Differential Privacy
...and 27 more
