Differentially Private Two-Stage Empirical Risk Minimization and Applications to Individualized Treatment Rule
Joowon Lee, Guanhua Chen
TL;DR
This work addresses end-to-end privacy in two-stage empirical risk minimization where a data-dependent first stage (covariate-balancing weights) feeds into a second-stage weighted ERM. The authors propose DP-2ERM, which preserves first-stage weights non-privately and applies tailored objective perturbation noise to the second stage, enabling end-to-end ${\\epsilon}$-DP or $(\\epsilon,\\delta)$-DP guarantees. They develop deterministic stability bounds for weight perturbations across IPW, MMD, and EBW schemes, along with probabilistic sensitivity bounds for the private estimator, yielding favorable privacy-utility trade-offs compared with standard composition. Substantial theoretical contributions include perturbation bounds, stability results, and a utility analysis showing near-optimal private performance in simulations and real-data ITR applications. Empirically, DP-2ERM with entropy balancing and kernel-based weights demonstrates notable improvements over baselines, underscoring the practical impact for private individualized treatment rule learning and related causal-inference pipelines.
Abstract
Differential Privacy (DP) provides a rigorous framework for deriving privacy-preserving estimators by injecting calibrated noise to mask individual contributions while preserving population-level insights. Its central challenge lies in the privacy-utility trade-off: calibrating noise levels to ensure robust protection without compromising statistical performance. Standard DP methods struggle with a particular class of two-stage problems prevalent in individualized treatment rules (ITRs) and causal inference. In these settings, data-dependent weights are first computed to satisfy distributional constraints, such as covariate balance, before the final parameter of interest is estimated. Current DP approaches often privatize stages independently, which either degrades weight efficacy-leading to biased and inconsistent estimates-or introduces excessive noise to account for worst-case scenarios. To address these challenges, we propose the Differentially Private Two-Stage Empirical Risk Minimization (DP-2ERM), a framework that injects a carefully calibrated noise only into the second stage while maintaining privacy for the entire pipeline and preserving the integrity of the first stage weights. Our theoretical contributions include deterministic bounds on weight perturbations across various widely used weighting methods, and probabilistic bounds on sensitivity for the final estimator. Simulations and real-world applications in ITR demonstrate that DP-2ERM significantly enhances utility over existing methods while providing rigorous privacy guarantees.
