Federated Causal Inference: Multi-Study ATE Estimation beyond Meta-Analysis
Rémi Khellaf, Aurélien Bellet, Julie Josse
TL;DR
The paper addresses multi-study Average Treatment Effect (ATE) estimation from decentralized RCT data by formulating Federated Causal Inference and comparing three estimator families: meta-analysis, one-shot federated, and gradient-based federated methods. It derives the asymptotic variance of these estimators under a linear outcome model, analyzes their performance under homogeneous, covariate-shift, and study-effect scenarios, and offers a decision diagram for practitioners. The key contributions are characterizing when pooled-data performance is achievable in a federated setting, showing that one-shot inverse-variance-weighted (IVW) and gradient-based approaches can match pooling under favorable conditions, and detailing how to adjust for study effects to retain unbiasedness. The results are relevant to regulatory science and collaborative clinical research: they support robust, privacy-preserving estimation of the population ATE across multiple centers, with explicit guidance on estimator choice and communication costs, backed by synthetic and semi-synthetic validations.
Abstract
We study Federated Causal Inference, an approach to estimating treatment effects from decentralized data across centers. We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula, ranging from simple meta-analysis to one-shot and multi-shot federated learning, the latter leveraging the full data to learn the outcome model (albeit requiring more communication). Focusing on Randomized Controlled Trials (RCTs), we derive the asymptotic variance of these estimators for linear models. Our results provide practical guidance on selecting the appropriate estimator for various scenarios, including heterogeneity in sample sizes, covariate distributions, treatment assignment schemes, and center effects. We validate these findings with a simulation study.
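To make the contrast between estimator classes concrete, here is a minimal sketch in Python, not the paper's implementation. It assumes a homogeneous linear outcome model with a constant treatment effect, simulated centers, and sample-size weights (rather than inverse-variance weights) for both aggregation steps. All names (`simulate_center`, `local_summary`, `TRUE_ATE`, the covariate shifts and center sizes) are hypothetical choices made for illustration. The multi-shot, gradient-based variant is omitted for brevity; the pooled fit is included only as a benchmark.

```python
import numpy as np

TRUE_ATE = 2.0  # hypothetical constant treatment effect used by the simulation


def simulate_center(n, rng, shift=0.0, p_treat=0.5):
    """Simulate one RCT center: linear outcome, randomized treatment."""
    X = rng.normal(loc=shift, size=(n, 3))      # covariates, optionally shifted
    W = rng.binomial(1, p_treat, size=n)        # randomized treatment assignment
    beta = np.array([1.0, -0.5, 0.25])
    Y = 1.0 + X @ beta + TRUE_ATE * W + rng.normal(size=n)
    return X, W, Y


def fit_arm(X, Y):
    """OLS with intercept on a single treatment arm."""
    Z = np.column_stack([np.ones(len(X)), X])
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return theta


def local_summary(X, W, Y):
    """Quantities a center would share: arm-wise OLS fits, covariate mean, local ATE."""
    theta1 = fit_arm(X[W == 1], Y[W == 1])
    theta0 = fit_arm(X[W == 0], Y[W == 0])
    xbar = np.concatenate([[1.0], X.mean(axis=0)])  # leading 1 for the intercept
    return {
        "n": len(Y), "n1": int(W.sum()), "n0": int((1 - W).sum()),
        "theta1": theta1, "theta0": theta0, "xbar": xbar,
        "ate": xbar @ (theta1 - theta0),            # local plug-in G-formula
    }


rng = np.random.default_rng(0)
centers = [simulate_center(n, rng, shift=s) for n, s in [(400, 0.0), (600, 0.5), (1000, 1.0)]]
summaries = [local_summary(*c) for c in centers]

n = sum(s["n"] for s in summaries)
n1 = sum(s["n1"] for s in summaries)
n0 = sum(s["n0"] for s in summaries)

# Meta-analysis: aggregate the local ATE estimates (sample-size weights here).
tau_meta = sum(s["n"] / n * s["ate"] for s in summaries)

# One-shot federated: aggregate local model parameters and covariate means
# in a single round, then apply the plug-in G-formula once.
theta1_fed = sum(s["n1"] / n1 * s["theta1"] for s in summaries)
theta0_fed = sum(s["n0"] / n0 * s["theta0"] for s in summaries)
xbar_fed = sum(s["n"] / n * s["xbar"] for s in summaries)
tau_one_shot = xbar_fed @ (theta1_fed - theta0_fed)

# Pooled benchmark: what one would compute if all data could be centralized.
X_all = np.vstack([c[0] for c in centers])
W_all = np.concatenate([c[1] for c in centers])
Y_all = np.concatenate([c[2] for c in centers])
tau_pooled = local_summary(X_all, W_all, Y_all)["ate"]

print(f"meta={tau_meta:.3f}  one-shot={tau_one_shot:.3f}  pooled={tau_pooled:.3f}")
```

In this sketch each center shares only summary quantities (arm-wise coefficients, arm sizes, covariate means), never individual records. Replacing the sample-size weights with inverse-variance weights would require each center to also share a variance estimate, which is left out here.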
