Federated Causal Inference: Multi-Study ATE Estimation beyond Meta-Analysis

Rémi Khellaf, Aurélien Bellet, Julie Josse

TL;DR

The paper addresses multi-study ATE estimation from decentralized RCT data. It formalizes Federated Causal Inference and compares three estimator families: meta-analysis, one-shot federated, and gradient-based (multi-shot) federated methods. It derives the asymptotic variance of each under a linear outcome model, analyzes performance under homogeneous, covariate-shift, and study-effect scenarios, and offers a decision diagram for practitioners. The key contributions are characterizing when pooled-data performance is achievable in a federated setting, showing that one-shot IVW and gradient-based approaches can match pooling under favorable conditions, and detailing how to adjust for study effects to retain unbiasedness. The results are relevant to regulatory science and collaborative clinical research: they support privacy-preserving estimation of the population ATE across multiple centers, with explicit guidance on estimator choice and communication costs, backed by synthetic and semi-synthetic validations.

Abstract

We study Federated Causal Inference, an approach to estimate treatment effects from decentralized data across centers. We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula, ranging from simple meta-analysis to one-shot and multi-shot federated learning, the latter leveraging the full data to learn the outcome model (albeit requiring more communication). Focusing on Randomized Controlled Trials (RCTs), we derive the asymptotic variance of these estimators for linear models. Our results provide practical guidance on selecting the appropriate estimator for various scenarios, including heterogeneity in sample sizes, covariate distributions, treatment assignment schemes, and center effects. We validate these findings with a simulation study.
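
As a rough illustration of the plug-in G-formula that the abstract refers to, the sketch below fits one linear outcome model per treatment arm on a single simulated RCT and averages the difference of the two predictions over all units. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `gformula_ate`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def gformula_ate(X, W, Y):
    """Plug-in G-formula ATE using one OLS fit per treatment arm.

    Assumption: the outcome is linear in the covariates within each arm.
    """
    def fit_ols(X_arm, Y_arm):
        # Add an intercept column and solve the least-squares problem.
        design = np.column_stack([np.ones(len(X_arm)), X_arm])
        coef, *_ = np.linalg.lstsq(design, Y_arm, rcond=None)
        return coef

    design_all = np.column_stack([np.ones(len(X)), X])
    mu1 = design_all @ fit_ols(X[W == 1], Y[W == 1])  # predicted outcome if treated
    mu0 = design_all @ fit_ols(X[W == 0], Y[W == 0])  # predicted outcome if control
    return float(np.mean(mu1 - mu0))

# Hypothetical single-study RCT: Bernoulli(0.5) assignment, true ATE = 2.0.
rng = np.random.default_rng(0)
n, d, true_tau = 5000, 3, 2.0
X = rng.normal(size=(n, d))
W = rng.binomial(1, 0.5, size=n)
Y = 1.0 + X @ np.array([0.5, -1.0, 0.3]) + true_tau * W + rng.normal(size=n)

tau_hat = gformula_ate(X, W, Y)
print(f"estimated ATE: {tau_hat:.3f}")
```

In a multi-study setting, each center could run an estimator of this kind on its own data and share only the resulting estimate (and its variance), which is the starting point for the meta-analysis family described above.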

Paper Structure

This paper contains 46 sections, 5 theorems, 85 equations, 13 figures, 5 tables, 1 algorithm.

Key Result

Proposition 1

$\hat{\tau}_{\mathrm{Meta\text{-}IVW}}$ is the minimum-variance estimator of $\tau$ among the class of aggregation-based estimators.
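
To make the proposition concrete, the sketch below compares inverse-variance weighting (IVW) with sample-size weighting (reading "SW" as sample-size weighting) for three hypothetical studies. It assumes the local estimates are independent, that their variances are known, and that aggregation weights are normalized to sum to one; the numbers and the helper name `aggregated_variance` are illustrative only.

```python
import numpy as np

def aggregated_variance(variances, weights):
    """Variance of a weighted average of independent local estimates."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one
    return float(np.sum(w**2 * np.asarray(variances, dtype=float)))

# Hypothetical per-study variances of the local ATE estimates and sample sizes.
variances = np.array([0.04, 0.10, 0.50])
sample_sizes = np.array([500, 300, 200])

var_ivw = aggregated_variance(variances, 1.0 / variances)  # IVW: w_k proportional to 1 / V_k
var_sw = aggregated_variance(variances, sample_sizes)      # SW: w_k proportional to n_k

# With IVW weights the aggregated variance equals 1 / sum_k (1 / V_k).
print(f"IVW variance: {var_ivw:.4f}, SW variance: {var_sw:.4f}")
```

With these inputs the IVW variance is $1/37 \approx 0.027$, against $0.039$ for sample-size weighting. The two weighting schemes coincide when each local variance is inversely proportional to the study's sample size with a common constant.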

Figures (13)

  • Figure 1: Graphical model for homogeneous settings
  • Figure 2: Graphical model in the heterogeneous distributions setting
  • Figure 3: Graphical model for the heterogeneous study-effects setting
  • Figure 4: Multi-study ATE estimation under homogeneous and heterogeneous scenarios
  • Figure 5: Graphical model in the fully heterogeneous case
  • ...and 8 more figures

Theorems & Definitions (16)

  • Definition 1: Robins (1986)
  • Definition 2: Meta-Analysis - SW Aggregation
  • Definition 3: Meta-Analysis - IVW Aggregation
  • Proposition 1: proof in \ref{proof:meta_ivw_min_var}
  • Remark 1
  • Definition 4: SW Federation of $\hat{\theta}_k^{(w)}$
  • Definition 5: IVW Federation of $\hat{\theta}_k^{(w)}$
  • Theorem 1: proof in \ref{proof:min_var_ivw_thetas}
  • Definition 6: 1S SW Federation - SW Aggregation
  • Definition 7: 1S IVW Federation - SW Aggregation
  • ...and 6 more