Foundation Models for Causal Inference via Prior-Data Fitted Networks

Yuchen Ma, Dennis Frauen, Emil Javurek, Stefan Feuerriegel

TL;DR

CausalFM introduces a framework for training tabular foundation models to perform Bayesian causal inference via in-context learning by using SCM-based priors over observational and interventional distributions. It formalizes priors, identifiability guarantees, and a training algorithm that simulates interventional data from SCMs to approximate the conditional PPID $\Pi_\mathrm{int}(Y\mid\mathcal{D}_n,X=x)$. The approach supports back-door, front-door, and instrumental-variable settings, and demonstrates competitive performance on standard CATE, IV, and front-door benchmarks without dataset-specific retraining. This work offers a scalable, uncertainty-aware, and identifiability-consistent paradigm that could transform causal inference practice across medicine, economics, and beyond.
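
As a rough sketch of what "approximating the conditional PPID" could mean in training terms, the standard PFN objective (expected negative log-likelihood over tasks drawn from the prior) can be adapted so that the context is observational and the target is interventional. The symbols $\pi$ (a prior over SCMs $\mathcal{S}$) and $q_\theta$ (the transformer's predictive distribution) are notation assumed here for illustration and are not taken from the summary above:

$$
\mathcal{L}(\theta) \;=\; \mathbb{E}_{\mathcal{S}\sim\pi}\;\mathbb{E}_{\mathcal{D}_n\sim\mathbb{P}_\mathrm{obs}^{\mathcal{S}}}\;\mathbb{E}_{(x,y)\sim\mathbb{P}_\mathrm{int}^{\mathcal{S}}}\Bigl[-\log q_\theta\bigl(y \mid \mathcal{D}_n, X=x\bigr)\Bigr]
$$

Under this assumed form, minimizing $\mathcal{L}(\theta)$ pushes $q_\theta(\cdot\mid\mathcal{D}_n, X=x)$ toward the posterior predictive distribution implied by the prior, which is the usual argument for why PFNs perform approximate Bayesian inference in context.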

Abstract

Prior-data fitted networks (PFNs) have recently been proposed as a promising way to train tabular foundation models. PFNs are transformers that are pre-trained on synthetic data generated from a prespecified prior distribution and that enable Bayesian inference through in-context learning. In this paper, we introduce CausalFM, a comprehensive framework for training PFN-based foundation models in various causal inference settings. First, we formalize the construction of Bayesian priors for causal inference based on structural causal models (SCMs) in a principled way and derive necessary criteria for the validity of such priors. Building on this, we propose a novel family of prior distributions using causality-inspired Bayesian neural networks that enable CausalFM to perform Bayesian causal inference in various settings, including for back-door, front-door, and instrumental variable adjustment. Finally, we instantiate CausalFM and explicitly train models to perform in-context learning in these settings. We show that CausalFM achieves competitive in-context learning performance even when compared to baselines that are specifically trained for the task at hand. In sum, our framework can be used as a general recipe to train foundation models for various causal inference settings. In contrast to the current state-of-the-art in causal inference, CausalFM offers a novel paradigm with the potential to fundamentally change how practitioners perform causal inference in medicine, economics, and other disciplines.
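
To make the idea of pre-training on synthetic data from an SCM prior more concrete, the following is a minimal sketch of how one training task might be simulated in the back-door setting. It is not the authors' prior: the abstract describes causality-inspired Bayesian neural networks, whereas this toy uses a hand-written linear/tanh SCM. All names (`sample_backdoor_scm`, `outcome`, `simulate_task`) and all distributional choices are hypothetical.

```python
import numpy as np

def sample_backdoor_scm(rng, d=3):
    """Draw parameters of one toy SCM (a stand-in for a draw from the prior)."""
    return {
        "w_a": rng.normal(size=d),       # covariates -> treatment propensity
        "w_y": rng.normal(size=d),       # covariates -> baseline outcome
        "w_tau": rng.normal(size=d),     # covariates -> treatment effect
        "noise": rng.uniform(0.1, 0.5),  # outcome noise scale
    }

def outcome(scm, x, a, eps):
    """Structural equation for Y, shared by observational and interventional draws."""
    return np.tanh(x @ scm["w_y"]) + a * (x @ scm["w_tau"]) + scm["noise"] * eps

def simulate_task(rng, n_context=64, n_query=8, d=3):
    """Simulate one task: an observational context plus interventional targets."""
    scm = sample_backdoor_scm(rng, d)

    # Observational context: treatment assignment depends on X (observed confounding).
    x = rng.normal(size=(n_context, d))
    propensity = 1.0 / (1.0 + np.exp(-(x @ scm["w_a"])))
    a = rng.binomial(1, propensity)
    y = outcome(scm, x, a, rng.normal(size=n_context))

    # Interventional queries: A is set by intervention, independently of X.
    x_q = rng.normal(size=(n_query, d))
    a_q = rng.integers(0, 2, size=n_query)
    y_q = outcome(scm, x_q, a_q, rng.normal(size=n_query))

    # Ground-truth CATE of this toy SCM: E[Y(1) - Y(0) | X = x] = x @ w_tau.
    cate_q = x_q @ scm["w_tau"]
    return {"context": (x, a, y), "query": (x_q, a_q), "target": y_q, "cate": cate_q}

rng = np.random.default_rng(0)
task = simulate_task(rng)
```

In a PFN-style setup, many such tasks would be generated on the fly; the model would receive `context` and `query` as input and be trained to predict `target`, so that at inference time it can be applied to a real observational dataset without retraining.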

Paper Structure

This paper contains 68 sections, 3 theorems, 55 equations, 1 figure, 8 tables.

Key Result

Theorem 4.3

Let $\mathcal{Z}$ be the set of all identifiability-violating SCMs $\mathcal{S}_0$ that satisfy $\mathbb{P}_\mathrm{obs}^{\mathcal{S}_0}\in\mathcal{P}_\mathrm{obs}$ and $Q\!\bigl(\mathbb{P}_\mathrm{int}^{\mathcal{S}_0}\bigr)\neq\bar{Q}\!\bigl(\mathbb{P}_\mathrm{obs}^{\mathcal{S}_0}\bigr)$. Assume that ...

Figures (1)

  • Figure 1: C-DAGs compatible with the three example causal inference settings. Yellow variables are observed, blue variables are unobserved, and red variables are clusters of variables.

Theorems & Definitions (12)

  • Definition 3.1
  • Definition 4.1: SCMs (Pearl, 2009)
  • Definition 4.2: $\mathcal{C}$-SCM-Priors
  • Theorem 4.3
  • proof
  • Lemma B.1
  • proof
  • Lemma B.2
  • proof
  • proof: Proof of Theorem 4.3
  • ...and 2 more