
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects

Maresa Schröder, Justin Hartenstein, Stefan Feuerriegel

TL;DR

This work is the first to derive a general, doubly robust framework for valid CIs of the ATE under ($\varepsilon,\delta$)-differential privacy, and demonstrates the effectiveness of the framework using synthetic and real-world medical datasets.

Abstract

The average treatment effect (ATE) is widely used to evaluate the effectiveness of drugs and other medical interventions. In safety-critical applications like medicine, reliable inferences about the ATE typically require valid uncertainty quantification, such as through confidence intervals (CIs). However, estimating treatment effects in these settings often involves sensitive data that must be kept private. In this work, we present PrivATE, a novel machine learning framework for computing CIs for the ATE under differential privacy. Specifically, we focus on deriving valid privacy-preserving CIs for the ATE from observational data. Our PrivATE framework consists of three steps: (i) estimating the differentially private ATE through output perturbation; (ii) estimating the differentially private variance in a doubly robust manner; and (iii) constructing the CIs while accounting for the uncertainty from both the estimation and privatization steps. Our PrivATE framework is model agnostic, doubly robust, and ensures valid CIs. We demonstrate the effectiveness of our framework using synthetic and real-world medical datasets. To the best of our knowledge, we are the first to derive a general, doubly robust framework for valid CIs of the ATE under ($\varepsilon,\delta$)-differential privacy.
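The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the equal split of the privacy budget, and the sensitivity arguments `sens_ate` and `sens_var` are assumptions made here for illustration (the paper derives its own calibration, which this summary does not show). The nuisance estimates `mu0`, `mu1`, and `pi` are treated as given, and the interval simply adds the variance of the injected noise to the sampling variance.

```python
import numpy as np
from statistics import NormalDist


def aipw_scores(y, a, mu0, mu1, pi):
    """Doubly robust (AIPW) pseudo-outcomes; their mean estimates the ATE."""
    return mu1 - mu0 + a * (y - mu1) / pi - (1 - a) * (y - mu0) / (1 - pi)


def gaussian_sigma(sensitivity, eps, delta):
    """Noise scale of the classical Gaussian mechanism (requires 0 < eps < 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps


def private_ate_ci(y, a, mu0, mu1, pi, eps, delta,
                   sens_ate, sens_var, alpha=0.05, rng=None):
    """Illustrative private CI; sens_ate and sens_var are hypothetical inputs."""
    rng = np.random.default_rng() if rng is None else rng
    y, a, mu0, mu1, pi = (np.asarray(v, dtype=float) for v in (y, a, mu0, mu1, pi))
    n = len(y)
    scores = aipw_scores(y, a, mu0, mu1, pi)

    # Assumption: split the budget equally; by basic composition the two
    # releases together satisfy (eps, delta)-differential privacy.
    eps_half, delta_half = eps / 2.0, delta / 2.0

    # (i) Output perturbation of the ATE estimate.
    sigma_ate = gaussian_sigma(sens_ate, eps_half, delta_half)
    ate_priv = float(scores.mean() + rng.normal(0.0, sigma_ate))

    # (ii) Noisy variance of the doubly robust scores, clipped at zero.
    sigma_var = gaussian_sigma(sens_var, eps_half, delta_half)
    var_priv = max(float(scores.var(ddof=1) + rng.normal(0.0, sigma_var)), 0.0)

    # (iii) Interval covering sampling uncertainty and privatization noise.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * np.sqrt(var_priv / n + sigma_ate ** 2)
    return ate_priv, (ate_priv - half_width, ate_priv + half_width)
```

A call such as `private_ate_ci(y, a, mu0, mu1, pi, eps=0.5, delta=1e-5, sens_ate=0.01, sens_var=0.01)` returns the noisy point estimate and an interval around it. The sketch ignores the extra uncertainty in the noisy variance itself and the privacy cost of fitting the nuisance models, both of which a complete treatment would need to address.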

Paper Structure

This paper contains 35 sections, 7 theorems, 39 equations, 6 figures, 3 tables.

Key Result

Lemma 4.2

We make the following assumptions on the nuisance functions: The nuisance functions are (i) bounded, (ii) estimated at rates $n^{-\beta_{\mu}}$ and $n^{-\beta_{\pi}}$ with $\beta_{\mu} + \beta_{\pi} \geq \frac{1}{2}$, and (iii) in a local neighborhood of the true nuisance functions, i.e., there exis

Figures (6)

  • Figure 1: Standard vs. private CIs for the ATE. Standard CIs allow for inference about individual data samples, whereas CIs under DP are designed to prevent such inference.
  • Figure 2: Key literature for estimating CIs of the ATE.
  • Figure 3: Our proposed PrivATE framework for constructing $(\varepsilon, \delta)$-differentially private CIs for the ATE.
  • Figure 4: Performance across different privacy budgets and sample sizes. Results for Dataset 1 over 10 runs with base values $\varepsilon=0.5$, $\delta=10^{-5}$, $n=3000$. The standard CIs incorporate only the sampling uncertainty (no privacy considerations). $\Rightarrow$ The results confirm our theoretical intuition: with a larger privacy budget or larger $n$, our CIs approach the standard CIs.
  • Figure 5: Privacy-utility trade-off. We report the utility curves with respect to the privacy budget for various utility functions (confidence level 0.95). As expected, PrivATE outperforms the naive and bootstrapping methods, with utility increasing as the weight on the privacy constraint increases. The utility of the naive CIs does not vary significantly across utility weightings.
  • ...and 1 more figure
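The trend described for Figure 4 can be reproduced with a back-of-the-envelope calculation. The sketch below is an illustration under stated assumptions, not the paper's experiment: it assumes scores bounded in `[-bound, bound]`, so that replacing one record changes their mean by at most `2 * bound / n`, and it calibrates noise with the classical Gaussian mechanism. The function name and the default values for `score_sd` and `bound` are hypothetical.

```python
import math
from statistics import NormalDist


def ci_half_widths(n, eps, delta=1e-5, score_sd=2.0, bound=10.0, alpha=0.05):
    """Return (standard, private) CI half-widths under the stated assumptions."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Assumed sensitivity of a mean of n scores bounded in [-bound, bound].
    sensitivity = 2.0 * bound / n
    # Classical Gaussian mechanism noise scale (requires 0 < eps < 1).
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    standard = z * score_sd / math.sqrt(n)
    private = z * math.sqrt(score_sd ** 2 / n + sigma ** 2)
    return standard, private


# The ratio of private to standard width shrinks toward 1 as n or eps grows.
for n in (500, 3000, 20000):
    standard, private = ci_half_widths(n, eps=0.5)
    print(f"n={n:6d}  eps=0.5  width ratio={private / standard:.2f}")
for eps in (0.1, 0.5, 0.9):
    standard, private = ci_half_widths(3000, eps=eps)
    print(f"n=  3000  eps={eps}  width ratio={private / standard:.2f}")
```

Because the noise scale decays as $1/n$ while the sampling standard error decays as $1/\sqrt{n}$, the privatization term becomes negligible for large samples, which matches the caption's claim.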

Theorems & Definitions (20)

  • Definition 3.1: Differential privacy [Dwork.2009]
  • Definition 3.2: Gaussian privacy mechanism [Dwork.2014]
  • Definition 4.1
  • Lemma 4.2
  • proof
  • Theorem 4.3: ATE privatization
  • proof
  • Lemma 4.4
  • proof
  • Theorem 4.5: Differentially private variance estimation
  • ...and 10 more
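
For reference, the two definitions listed first are reproduced below in their standard textbook form. The notation ($\mathcal{M}$, $D$, $D'$, $S$, $f$, $\Delta_2 f$) is assumed here and may differ from the paper's. A randomized mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-differentially private if, for all datasets $D, D'$ differing in one record and all measurable sets $S$,

$$
\Pr[\mathcal{M}(D) \in S] \leq e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .
$$

The Gaussian mechanism releases a function $f$ with $\ell_2$-sensitivity $\Delta_2 f = \max_{D, D'} \lVert f(D) - f(D') \rVert_2$ by adding calibrated noise,

$$
\mathcal{M}(D) = f(D) + \mathcal{N}(0, \sigma^2 I), \qquad \sigma \geq \frac{\Delta_2 f \, \sqrt{2 \ln(1.25/\delta)}}{\varepsilon},
$$

which satisfies $(\varepsilon, \delta)$-differential privacy for $\varepsilon \in (0, 1)$.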