CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Benson Li, Junwei Ma, Jesse C. Cresswell, Rahul G. Krishnan
TL;DR
CausalPFN is a transformer that amortizes causal effect estimation. It is pre-trained on a large library of simulated data-generating processes (DGPs) that satisfy the ignorability assumption, and for new observational data it produces posterior predictive distributions (PPDs) over conditional expected potential outcomes (CEPOs). Trained with a causal data-prior loss, a single model $q_\theta$ maps observed data to CEPO-PPDs, which enables zero-shot estimation of the conditional average treatment effect (CATE) and the average treatment effect (ATE) with calibrated uncertainty. The approach achieves state-of-the-art average CATE performance across IHDP, ACIC, and Lalonde, competitive ATE results, and competitive uplift modeling, while shifting the heavy posterior computation to pre-training. The authors formalize identifiability conditions that ensure consistency, provide calibration mechanisms (e.g., temperature scaling) to address epistemic uncertainty, and release code and priors to foster adoption in practice.
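To make the step from CEPO-PPDs to treatment effects concrete, the following is a minimal sketch, not the authors' implementation. It assumes that draws from the PPD of the conditional expected potential outcome under control and under treatment are already available for each unit as NumPy arrays. The function name `effects_from_cepo_draws`, the array layout, and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def effects_from_cepo_draws(mu0_draws, mu1_draws, alpha=0.05):
    """Summarize treatment effects from CEPO draws (illustrative only).

    mu0_draws, mu1_draws: arrays of shape (n_draws, n_units), assumed to hold
    draws of E[Y(0) | X = x_i] and E[Y(1) | X = x_i] for each unit i.
    """
    mu0_draws = np.asarray(mu0_draws, dtype=float)
    mu1_draws = np.asarray(mu1_draws, dtype=float)

    # Per-draw, per-unit effect: difference of the two potential-outcome means.
    cate_draws = mu1_draws - mu0_draws

    # Point estimate of CATE for each unit: average over draws.
    cate = cate_draws.mean(axis=0)

    # Per-unit (1 - alpha) interval taken from the empirical quantiles.
    lower, upper = np.quantile(cate_draws, [alpha / 2, 1 - alpha / 2], axis=0)

    # ATE point estimate: average of the per-unit CATE estimates.
    ate = float(cate.mean())
    return cate, lower, upper, ate


# Toy input: 4 draws for 3 units (made-up numbers, not results from the paper).
mu0 = np.zeros((4, 3))
mu1 = np.array([[1.0, 2.0, 3.0],
                [1.0, 2.0, 3.0],
                [3.0, 4.0, 5.0],
                [3.0, 4.0, 5.0]])
cate, lower, upper, ate = effects_from_cepo_draws(mu0, mu1)
print(cate, lower, upper, ate)
```

The sketch reports intervals only per unit, because it does not assume the draws are joint across units; an interval for the ATE would need that additional assumption.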
Abstract
Causal effect estimation from observational data is fundamental across various applications. However, selecting an appropriate estimator from dozens of specialized methods demands substantial manual effort and domain expertise. We present CausalPFN, a single transformer that amortizes this workflow: trained once on a large library of simulated data-generating processes that satisfy ignorability, it infers causal effects for new observational datasets out of the box. CausalPFN combines ideas from Bayesian causal inference with the large-scale training protocol of prior-fitted networks (PFNs), learning to map raw observations directly to causal effects without any task-specific adjustment. Our approach achieves superior average performance on heterogeneous and average treatment effect estimation benchmarks (IHDP, Lalonde, ACIC). Moreover, it shows competitive performance for real-world policy making on uplift modeling tasks. CausalPFN provides calibrated uncertainty estimates to support reliable decision-making based on Bayesian principles. This ready-to-use model requires no further training or tuning and takes a step toward automated causal inference (https://github.com/vdblm/CausalPFN/).
