Transfer Learning for Causal Effect Estimation
Song Wei, Hanyu Zhang, Ronald Moore, Rishikesan Kamaleswaran, Yao Xie
TL;DR
This work addresses causal-effect estimation with limited target-domain data by proposing Transfer Causal Learning ($\ell_1$-TCL), which transfers nuisance-model information from a related source domain and corrects it via sparsity-driven bias adjustment before plug-in ACE estimation. The method supports GLM nuisance models and NN-based extensions, and provides non-asymptotic guarantees under sparsity, showing favorable performance on synthetic and real data, including a sepsis vasopressor-ACE case where naive baselines fail. Key contributions include a two-stage transfer procedure (rough source estimation plus $\ell_1$ bias correction), theoretical error bounds that separate bias and rough-estimation components, and a generic NN-friendly TCL framework with ParT for CATE-like problems. The results demonstrate the practical impact of principled transfer in causal inference, enabling more reliable decision-making in data-limited medical settings and guiding hyperparameter selection via covariate balance metrics.
Abstract
We present a Transfer Causal Learning (TCL) framework for settings in which the target and source domains share the same covariate/feature space, aiming to improve the accuracy of causal effect estimation when target-domain data are limited. Limited data are common in medical applications, where rare conditions such as sepsis are of interest. Our proposed method, \texttt{$\ell_1$-TCL}, applies $\ell_1$-regularized transfer learning (TL) to the nuisance models (e.g., the propensity score model); the resulting TL estimates of the nuisance parameters are plugged into downstream average causal/treatment effect estimators (e.g., the inverse probability weighted estimator). We establish non-asymptotic recovery guarantees for \texttt{$\ell_1$-TCL} with generalized linear models (GLMs) under a sparsity assumption in the high-dimensional setting, and demonstrate its empirical benefits through extensive numerical simulations with GLM and recent neural network nuisance models. We then apply the method to real data, where it yields insights consistent with the medical literature in a case where all baseline methods fail.
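To make the pipeline described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a logistic (GLM) propensity model, a rough estimate fit on the source domain only, an $\ell_1$-penalized correction fit on the target domain with the source fit held fixed as an offset, and a plug-in inverse probability weighted (IPW) estimator. The function names (`fit_logistic`, `l1_tcl_propensity`, `ipw_ace`), the proximal-gradient solver, and the clipping threshold are all illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    # Clip the logits to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(X, t, lam=0.0, offset=None, n_iter=2000):
    """Logistic regression by proximal gradient descent.

    lam    : l1 penalty weight (0 gives plain maximum likelihood).
    offset : fixed per-sample logit added to X @ beta (assumed known).
    """
    n, d = X.shape
    offset = np.zeros(n) if offset is None else offset
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature
    # of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta + offset) - t) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the l1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta


def l1_tcl_propensity(X_src, t_src, X_tgt, t_tgt, lam):
    """Two-stage transfer estimate of the propensity parameters (sketch)."""
    # Stage 1: rough estimate from the (larger) source domain.
    beta_src = fit_logistic(X_src, t_src)
    # Stage 2: sparse correction on the target domain, with the
    # source fit entering as a fixed offset.
    delta = fit_logistic(X_tgt, t_tgt, lam=lam, offset=X_tgt @ beta_src)
    return beta_src + delta


def ipw_ace(X, t, y, beta, clip=1e-3):
    """Plug-in IPW estimate of the average causal effect."""
    # Clipping the propensities is an added safeguard, not part of the source.
    e = np.clip(sigmoid(X @ beta), clip, 1.0 - clip)
    return np.mean(t * y / e - (1.0 - t) * y / (1.0 - e))
```

Under these assumptions, a typical call would be `beta = l1_tcl_propensity(X_src, t_src, X_tgt, t_tgt, lam)` followed by `ipw_ace(X_tgt, t_tgt, y_tgt, beta)`. A very large `lam` shrinks the correction to zero and returns the source fit unchanged, while `lam = 0` refits freely on the target with the source fit as a starting offset. No intercept is fit, so `X` should include a constant column if one is wanted.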
