A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning

Spencer Giddens, Yiwang Zhou, Kevin R. Krull, Tara M. Brinkman, Peter X. K. Song, Fang Liu

TL;DR

All empirical results demonstrate the feasibility of training OWL models via wERM with DP guarantees while maintaining sufficiently robust model performance, providing strong evidence for the practicality of implementing the proposed privacy-preserving OWL procedure in real-world scenarios involving sensitive data.

Abstract

It is common practice to use data containing personal information to build predictive models in the framework of empirical risk minimization (ERM). While these models can be highly accurate in prediction, sharing the results from these models trained on sensitive data may be susceptible to privacy attacks. Differential privacy (DP) is an appealing framework for addressing such data privacy issues by providing mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data. Previous work has primarily concentrated on applying DP to unweighted ERM. We consider weighted ERM (wERM), an important generalization, where each individual's contribution to the objective function can be assigned varying weights. We propose the first differentially private algorithm for general wERM, with theoretical DP guarantees. Extending the existing DP-ERM procedures to wERM creates a pathway for deriving privacy-preserving learning methods for individualized treatment rules, including the popular outcome weighted learning (OWL). We evaluate the performance of the DP-wERM framework applied to OWL in both simulation studies and in a real clinical trial. All empirical results demonstrate the feasibility of training OWL models via wERM with DP guarantees while maintaining sufficiently robust model performance, providing strong evidence for the practicality of implementing the proposed privacy-preserving OWL procedure in real-world scenarios involving sensitive data.
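To make the weighted objective concrete, here is a minimal, non-private sketch of wERM. It is not the paper's DP algorithm and adds no noise. It assumes a logistic loss with an $\ell_2$ penalty and plain gradient descent; the function name `weighted_logistic_erm`, the hyperparameters, and the simulated data are all hypothetical. The usage part mimics an OWL-style setup under the common assumption that treatments $A_i \in \{-1, 1\}$ act as labels and each individual is weighted by a nonnegative outcome divided by the treatment propensity.

```python
import numpy as np

def weighted_logistic_erm(X, y, w, lam=0.1, lr=0.5, n_iter=500):
    """Minimise (1/n) * sum_i w_i * log(1 + exp(-y_i * x_i @ theta))
    + (lam / 2) * ||theta||^2 by gradient descent (illustrative only)."""
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(n_iter):
        margins = y * (X @ theta)
        # Derivative of log(1 + exp(-m)) with respect to m is -1 / (1 + exp(m)).
        grad_loss = -(w * y / (1.0 + np.exp(margins))) @ X / n
        theta -= lr * (grad_loss + lam * theta)
    return theta

# Hypothetical simulated trial: treatment randomised with probability 0.5.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x1])            # intercept + one covariate
A = rng.choice([-1.0, 1.0], size=n)              # observed treatment
B = 2.0 + x1 * A + rng.uniform(0.0, 0.1, size=n) # nonnegative outcome
propensity = 0.5

w = B / propensity                               # OWL-style individual weights
theta = weighted_logistic_erm(X, A, w)

# In this simulation the better treatment is sign(x1) by construction.
assigned = np.where(X @ theta >= 0, 1.0, -1.0)
acc = float(np.mean(assigned == np.where(x1 >= 0, 1.0, -1.0)))
```

Setting all weights to the same constant recovers ordinary unweighted ERM, which is the case the abstract says previous DP work has concentrated on.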

Paper Structure

This paper contains 22 sections, 3 theorems, 22 equations, 1 figure, 1 table, 2 algorithms.

Key Result

Lemma 1

(chaudhuri2011) Let $h_1: \mathbb{R}^p \rightarrow \mathbb{R}$ and $h_2 : \mathbb{R}^p \rightarrow \mathbb{R}$ be everywhere first-order differentiable functions. Assume $h_1(\boldsymbol{\theta})$ and $h_1(\boldsymbol{\theta}) + h_2(\boldsymbol{\theta})$ are both $\Lambda$-strongly convex. If …

Figures (1)

  • Figure 1: Mean (95% CI) privacy-preserving optimal treatment assignment accuracy rate and empirical treatment value on the test set over 200 repeats. Large values of $\epsilon$ are included to demonstrate the consistency of the proposed DP-wERM procedure: accuracy and empirical treatment value approach those of the non-private case ($\epsilon=\infty$) as $\epsilon$ increases. Numerical values corresponding to the plots can be found in the supplemental materials.
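The two metrics named in the caption can be sketched as follows. This is an assumption about their form, not the paper's stated definition: the value estimator below is the inverse-probability-weighted estimator commonly used in the OWL literature, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def empirical_value(outcome, treatment, assigned, propensity):
    """IPW estimate of the value of a treatment rule (assumed form):
    weighted mean outcome among individuals whose observed treatment
    agrees with the rule's assignment. Undefined if nobody agrees."""
    agree = (treatment == assigned).astype(float)
    ipw = agree / propensity
    return float(np.sum(outcome * ipw) / np.sum(ipw))

def assignment_accuracy(assigned, optimal):
    """Fraction of individuals assigned their (known) optimal treatment."""
    return float(np.mean(assigned == optimal))

# Toy data, purely illustrative.
outcome = np.array([1.0, 2.0, 3.0, 4.0])
treatment = np.array([1, -1, 1, -1])   # observed in the trial
assigned = np.array([1, 1, -1, -1])    # recommended by the fitted rule
optimal = np.array([1, 1, 1, -1])      # known only in simulations

value = empirical_value(outcome, treatment, assigned, propensity=0.5)
accuracy = assignment_accuracy(assigned, optimal)
```

The accuracy rate needs the true optimal treatment, so it is typically only computable in simulation studies; the empirical value can also be computed on real trial data.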

Theorems & Definitions (12)

  • Definition 1: weighted empirical risk minimization (wERM)
  • Definition 2: neighboring datasets (Dwork2006)
  • Definition 3: $(\epsilon, \delta)$-differential privacy (Dwork2006b)
  • Definition 4: $\ell_p$-global sensitivity (Liu2019)
  • Definition 5: $\Lambda$-strong convexity
  • Lemma 1
  • Theorem 1: main result
  • Remark 1
  • Remark 2
  • Definition 6: Huber loss
  • ...and 2 more
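
For orientation, the standard textbook forms of three of the definitions listed above are sketched below. These are the usual formulations and may differ in notation from the paper's own statements; the symbols $\mathcal{M}$, $D$, $D'$, $S$, $s$, $\Delta_{p,s}$ and $h$ are assumptions introduced here for illustration.

```latex
% (epsilon, delta)-differential privacy: for all neighboring datasets D, D'
% and all measurable output sets S of a randomized mechanism M,
\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .

% l_p-global sensitivity of a statistic s over neighboring datasets:
\Delta_{p,s} = \max_{D, D' \text{ neighboring}} \| s(D) - s(D') \|_p .

% Lambda-strong convexity of a differentiable function h:
% for all \boldsymbol{\theta}, \boldsymbol{\theta}' \in \mathbb{R}^p,
h(\boldsymbol{\theta}') \ge h(\boldsymbol{\theta})
  + \nabla h(\boldsymbol{\theta})^{\top} (\boldsymbol{\theta}' - \boldsymbol{\theta})
  + \frac{\Lambda}{2} \| \boldsymbol{\theta}' - \boldsymbol{\theta} \|_2^2 .
```

Smaller $\epsilon$ and $\delta$ mean the output distributions on neighboring datasets are harder to tell apart, which is why accuracy in Figure 1 is expected to improve as $\epsilon$ grows.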