A Meta-Learning Approach to Bayesian Causal Discovery

Anish Dhir; Matthew Ashman; James Requeima; Mark van der Wilk

TL;DR

The paper addresses causal structure learning under uncertainty by proposing BCNP, a Bayesian meta-learning model that maps observational data to a posterior over DAGs and samples from this posterior directly. A transformer-based encoder extracts a representation that is permutation equivariant with respect to nodes, and a decoder samples DAGs by combining a learned distribution over permutations with a lower-triangular edge mask, which ensures acyclicity by construction. Training minimizes the KL divergence between the true Bayesian causal posterior and the model’s posterior, which marginalizes over functional relationships and makes posterior sampling efficient. Empirically, BCNP performs competitively with or better than explicit Bayesian methods and existing meta-learning approaches on synthetic, semi-synthetic, and Syntren data.
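The acyclicity-by-construction idea can be illustrated with a minimal NumPy sketch. It is not the paper's decoder: the names `sample_dag`, `node_scores`, and `edge_probs` are hypothetical, and the permutation is drawn with a simple Gumbel-perturbed sort (Plackett-Luce sampling) as a stand-in for whatever learned permutation distribution the model uses.

```python
import numpy as np


def sample_dag(node_scores, edge_probs, rng):
    """Sample one adjacency matrix that is acyclic by construction.

    node_scores: shape (d,), hypothetical scores defining a distribution over orderings.
    edge_probs:  shape (d, d), hypothetical Bernoulli edge probabilities in ordering space.
    """
    d = len(node_scores)
    # Stand-in for a learned permutation distribution (an assumption):
    # perturb the scores with Gumbel noise and sort them.
    order = np.argsort(node_scores + rng.gumbel(size=d))
    perm = np.eye(d, dtype=int)[order]  # permutation matrix, perm[i, order[i]] = 1
    # Strictly lower-triangular mask: an edge may only connect two positions
    # i > j of the sampled ordering, so no directed cycle can form.
    mask = np.tril(np.ones((d, d), dtype=int), k=-1)
    edges = (rng.random((d, d)) < edge_probs).astype(int) * mask
    # Relabel rows and columns from ordering positions back to node indices.
    return perm.T @ edges @ perm


def is_acyclic(adj):
    """A directed graph on d nodes is acyclic iff its adjacency matrix is nilpotent."""
    d = adj.shape[0]
    return not np.linalg.matrix_power(adj, d).any()


rng = np.random.default_rng(0)
adj = sample_dag(rng.normal(size=5), np.full((5, 5), 0.5), rng)
print(adj)
print("acyclic:", is_acyclic(adj))
```

Because every sample is a relabelled strictly lower-triangular matrix, no separate acyclicity penalty or rejection step is needed, and repeated calls yield distinct graphs that can be treated as posterior samples once the two distributions are learned.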

Abstract

Discovering a unique causal structure is difficult due to both inherent identifiability issues and the consequences of finite data. As such, uncertainty over causal structures, such as that obtained from a Bayesian posterior, is often necessary for downstream tasks. Finding an accurate approximation to this posterior is challenging, due to the large number of possible causal graphs, as well as the difficulty of the subproblem of finding posteriors over the functional relationships of the causal edges. Recent works have used meta-learning to view the problem of estimating the maximum a posteriori causal graph as supervised learning. Yet, these methods are limited when estimating the full posterior, as they fail to encode key properties of the posterior, such as correlation between edges and permutation equivariance with respect to nodes. Further, these methods cannot reliably sample from the posterior over causal structures. To address these limitations, we propose a Bayesian meta-learning model that allows for sampling causal structures from the posterior and encodes these key properties. We compare our meta-Bayesian causal discovery against existing Bayesian causal discovery methods, demonstrating the advantages of directly learning a posterior over causal structure.
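The supervised-learning view of posterior estimation can be made concrete with a short derivation. The notation here is an assumption introduced for illustration: $G$ is a causal graph, $f$ the functional mechanisms, $D$ a dataset, and $q_\theta(G \mid D)$ the model's approximate posterior.

```latex
\begin{align}
\mathbb{E}_{p(D)}\Big[\mathrm{KL}\big(p(G \mid D)\,\|\,q_\theta(G \mid D)\big)\Big]
  &= \mathbb{E}_{p(D)}\,\mathbb{E}_{p(G \mid D)}\big[\log p(G \mid D) - \log q_\theta(G \mid D)\big] \\
  &= -\,\mathbb{E}_{p(G, D)}\big[\log q_\theta(G \mid D)\big] + \text{const}, \\
p(G, D) &= \int p(G)\, p(f \mid G)\, p(D \mid f, G)\, \mathrm{d}f .
\end{align}
```

Under these assumptions the constant does not depend on $\theta$, so minimizing the expected KL divergence amounts to maximizing the log-probability the model assigns to the generating graph on simulated $(G, D)$ pairs. Drawing $f$ during simulation and then discarding it is what marginalizes over the functional relationships.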
