Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning
Anish Dhir, Cristiana Diaconu, Valentinian Mihai Lungu, James Requeima, Richard E. Turner, Mark van der Wilk
TL;DR
This work tackles the problem of estimating interventional distributions when the causal graph is uncertain. It introduces MACE-TNP, an end-to-end Transformer Neural Process that maps observational data directly to Bayesian model-averaged interventional distributions, bypassing the expensive computation of intermediate posteriors over causal structures and functional mechanisms. Empirically, the model converges to the analytic posterior in identifiable two-node cases, handles non-identifiability correctly when given interventional data, and outperforms strong Bayesian baselines in increasingly complex and high-dimensional settings, including real data from the Sachs dataset. The approach demonstrates the potential of meta-learning for scalable causal inference under uncertainty, while highlighting trade-offs in compute and the importance of training-data coverage for generalization.
Abstract
In scientific domains, from biology to the social sciences, many questions boil down to "What effect will we observe if we intervene on a particular variable?" If the causal relationships (e.g. a causal graph) are known, the interventional distributions can be estimated. Without this domain knowledge, the causal structure must be discovered from the available observational data. However, observational data are often compatible with multiple causal graphs, so methods that commit to a single structure are prone to overconfidence. A principled way to manage this structural uncertainty is Bayesian inference, which averages over a posterior distribution on possible causal structures and functional mechanisms. Unfortunately, the number of causal structures grows super-exponentially with the number of nodes in the graph, making these computations intractable. We propose to circumvent these challenges by using meta-learning to create an end-to-end model: the Model-Averaged Causal Estimation Transformer Neural Process (MACE-TNP). The model is trained to predict the Bayesian model-averaged interventional posterior distribution, and its end-to-end nature bypasses the need for expensive calculations. Empirically, we demonstrate that MACE-TNP outperforms strong Bayesian baselines. Our work establishes meta-learning as a flexible and scalable paradigm for approximating complex Bayesian causal inference that can be scaled to increasingly challenging settings in the future.
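To give a sense of the super-exponential growth mentioned above, the following minimal sketch counts labeled directed acyclic graphs (DAGs) on n nodes using Robinson's recurrence. It is an illustration only and not part of MACE-TNP; the function name `count_dags` is hypothetical, and the sketch assumes that "causal structures" means labeled DAGs.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence).

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


# Print the count for small graphs to show how quickly it grows.
for n in range(1, 8):
    print(n, count_dags(n))
```

Already at five nodes there are 29281 candidate graphs, so explicitly enumerating a posterior over structures becomes impractical for all but the smallest problems.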
