Differentially Private Learners for Heterogeneous Treatment Effects
Maresa Schröder, Valentyn Melnychuk, Stefan Feuerriegel
TL;DR
The paper addresses private estimation of conditional average treatment effects (CATE) from observational data by introducing DP-CATE, a Neyman-orthogonal, model-agnostic framework. It develops two output-perturbation schemes: a finite-queries variant, which perturbs a vector of CATE estimates with noise calibrated via influence functions, and a functional-queries variant, which privately releases the entire CATE function by adding Gaussian-process noise in a reproducing kernel Hilbert space (RKHS). The authors establish theoretical guarantees under $(\varepsilon,\delta)$-differential privacy, including quasi-oracle efficiency, and report experiments on synthetic and real-world medical datasets in which performance converges to that of non-private baselines as the privacy budget grows. The framework enables privacy-preserving CATE analysis at the subgroup or function level in settings such as personalized medicine, and it remains compatible with common meta-learners such as the R-learner and DR-learner.
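To make the finite-queries idea concrete, the following is a minimal sketch of output perturbation with the classical Gaussian mechanism. It is not the paper's calibration: the summary says DP-CATE derives its noise scale from influence functions, whereas this sketch assumes a known bound on the L2 sensitivity of the estimate vector. The function names and the `l2_sensitivity` parameter are hypothetical, and the noise formula is the textbook one, which holds for $0 < \varepsilon < 1$.

```python
import math
import random


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon,
    which gives (epsilon, delta)-DP for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("classical Gaussian mechanism needs 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if l2_sensitivity <= 0.0:
        raise ValueError("sensitivity must be positive")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize_cate_vector(cate_estimates, l2_sensitivity, epsilon, delta, rng=None):
    """Add i.i.d. Gaussian noise to a vector of CATE estimates.

    `l2_sensitivity` is an ASSUMED bound on how much the whole vector can
    change when one individual's record changes; it is a stand-in for the
    influence-function-based calibration described in the summary.
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [tau + rng.gauss(0.0, sigma) for tau in cate_estimates]


if __name__ == "__main__":
    # Hypothetical CATE estimates at three query points (e.g. three subgroups).
    estimates = [0.8, -0.2, 1.5]
    print(privatize_cate_vector(estimates, l2_sensitivity=0.05,
                                epsilon=0.5, delta=1e-5,
                                rng=random.Random(0)))
```

The noise scale shrinks as $\varepsilon$ grows, which matches the reported behavior that private estimates approach the non-private baseline for larger privacy budgets.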
Abstract
Patient data is widely used to estimate heterogeneous treatment effects and thus to understand the effectiveness and safety of drugs. Yet, patient data includes highly sensitive information that must be kept private. In this work, we aim to estimate the conditional average treatment effect (CATE) from observational data under differential privacy. Specifically, we present DP-CATE, a novel framework for CATE estimation that is Neyman-orthogonal and ensures differential privacy of the estimates. Our framework is highly general: it applies to any two-stage CATE meta-learner with a Neyman-orthogonal loss function, and any machine learning model can be used for nuisance estimation. We further provide an extension of DP-CATE that employs RKHS regression to release the complete CATE function while ensuring differential privacy. We demonstrate DP-CATE in experiments on synthetic and real-world datasets. To the best of our knowledge, we are the first to provide a framework for CATE estimation that is both Neyman-orthogonal and differentially private.
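As an illustration of what a two-stage meta-learner with a Neyman-orthogonal loss looks like, the sketch below computes the pseudo-outcomes of the standard DR-learner, one of the learners the framework is said to support. In the first stage, any model can estimate the nuisance functions (propensity score and outcome regressions); in the second stage, the pseudo-outcomes are regressed on the covariates to obtain the CATE. The function name, argument names, and the propensity clipping threshold are assumptions made for this example and are not taken from the paper.

```python
def dr_pseudo_outcomes(y, a, pi_hat, mu0_hat, mu1_hat, clip=0.01):
    """Doubly robust pseudo-outcomes for binary treatment a in {0, 1}.

    phi_i = mu1 - mu0 + (a - pi) / (pi * (1 - pi)) * (y - mu_a)

    y        observed outcomes
    a        observed treatments (0 or 1)
    pi_hat   estimated propensity scores P(A = 1 | X)
    mu0_hat  estimated outcome under control, E[Y | X, A = 0]
    mu1_hat  estimated outcome under treatment, E[Y | X, A = 1]
    clip     ASSUMED threshold that keeps propensities away from 0 and 1
    """
    pseudo = []
    for yi, ai, pi, m0, m1 in zip(y, a, pi_hat, mu0_hat, mu1_hat):
        p = min(max(pi, clip), 1.0 - clip)   # guard against division by ~0
        mu_a = m1 if ai == 1 else m0          # outcome model for observed arm
        pseudo.append(m1 - m0 + (ai - p) / (p * (1.0 - p)) * (yi - mu_a))
    return pseudo


if __name__ == "__main__":
    # Toy values; in practice the nuisance estimates come from fitted models.
    print(dr_pseudo_outcomes(
        y=[3.0, 0.0], a=[1, 0],
        pi_hat=[0.5, 0.5], mu0_hat=[1.0, 1.0], mu1_hat=[2.0, 2.0],
    ))
```

When the outcome model fits an observation exactly, the correction term vanishes and the pseudo-outcome reduces to the plug-in difference `mu1 - mu0`. Orthogonality means that first-order errors in the nuisance estimates do not carry over to the second-stage CATE estimate.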
