Analyzing the Differentially Private Theil-Sen Estimator for Simple Linear Regression

Jayshree Sarathy, Salil Vadhan

Abstract

In this paper, we study differentially private point and confidence interval estimators for simple linear regression. Motivated by recent work that highlights the strong empirical performance of an algorithm based on robust statistics, DPTheilSen, we provide a rigorous, finite-sample analysis of its privacy and accuracy properties, offer guidance on setting hyperparameters, and show how to produce differentially private confidence intervals to accompany its point estimates.

Paper Structure

This paper contains 25 sections, 25 theorems, 87 equations, 1 figure, 2 tables, 5 algorithms.

Key Result

Theorem 1.2.1

Let $\widetilde{\beta_1}^{\texttt{DPWide}\textrm{TS}}$ be the DPWideTS estimator with privacy loss parameter $\varepsilon$, hyperparameter $R$ for the range of the outputs, and hyperparameter $\theta$ for the granularity of the outputs. Assume that the true slope $\beta_1$ lies in the interval $[-R+ where $\Phi^{-1}$ is the inverse standard normal distribution function. Then, for sufficiently large $n$,

Figures (1)

  • Figure 1: Illustration of the standard non-private Theil-Sen algorithm [Theil50, Sen68], which (1) computes the slopes between all pairs of points, and (2) outputs the median slope (shown in dark blue).
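
The two steps in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration of the standard non-private Theil-Sen slope only, not of DPTheilSen or DPWideTS; the function name `theil_sen_slope` and the sample data are hypothetical, and skipping pairs with equal x values is an assumption made here to avoid division by zero.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(xs, ys):
    """Non-private Theil-Sen slope: the median of all pairwise slopes.

    Hypothetical helper for illustration; no privacy mechanism is applied.
    """
    # Step 1: compute the slope between every pair of points.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # assumption: pairs with identical x are skipped
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    # Step 2: output the median slope.
    return median(slopes)


# Four points on the line y = 2x + 1, plus one gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 100.0]
slope = theil_sen_slope(xs, ys)
print(slope)
```

Because six of the ten pairwise slopes equal 2 exactly, the median is unaffected by the outlier, which is the robustness property that motivates using this estimator as a starting point.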

Theorems & Definitions (49)

  • Theorem 1.2.1: Main result applied for case of asymptotically optimal design, informally stated
  • Definition 2.0.1: Differential Privacy [DMNS06]
  • Definition 2.0.2: Theil-Sen Estimator [Theil50, Sen68]
  • Lemma 3.0.1: [alabi2022differentially]
  • Definition 4.0.1: U-statistic for simple linear regression [Sen68]
  • Theorem 4.0.2: Convergence bound for $\widetilde{\beta_1}^{\texttt{DPWide}\textrm{TS}}$
  • Theorem 4.2.1
  • Definition B.0.1: Exponential Mechanism [McSherryT07]
  • Theorem B.0.3: Utility of DPWide
  • proof
  • ...and 39 more