Learning Neural Causal Models from Unknown Interventions
Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael C. Mozer, Chris Pal, Yoshua Bengio
TL;DR
This work addresses the identifiability limits of causal structure learning from observational data alone by proposing a neural, continuous-optimization framework (SDI) that combines observational and interventional data, even when the intervention targets are unknown. SDI operates in three phases: fitting conditional mechanisms on observational data, scoring candidate graphs against interventional data while inferring the intervened variable with a target-prediction heuristic, and updating the structure parameters via a REINFORCE-like gradient estimator with an acyclicity regularizer. On synthetic graphs and on standard graphs from the Bayesian Network Repository (BnLearn), SDI recovers the true graphs, generalizes to unseen interventions, and also works when part of the edge set is already known, outperforming several baselines. The approach targets practical causal discovery in settings where interventions are sparse, uncertain, or partially observed, with potential applications in biology and related sciences.
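The structure-update phase can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one logit per directed edge, uses a plain score-function (REINFORCE) estimator with a mean baseline, and stands in a simple pairwise penalty on length-2 cycles for the acyclicity regularizer. The function names, the `toy_score` stand-in (which peeks at the true graph instead of evaluating a likelihood on interventional data), and all hyperparameters are hypothetical and chosen only to make the example self-contained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample_graphs(gamma, num_samples, rng):
    """Draw binary adjacency matrices with P(edge) = sigmoid(gamma); no self-loops."""
    probs = sigmoid(gamma)
    graphs = (rng.random((num_samples,) + gamma.shape) < probs).astype(float)
    graphs *= 1.0 - np.eye(gamma.shape[0])  # zero the diagonal of every sample
    return graphs


def toy_score(graph, true_graph):
    """HYPOTHETICAL stand-in for a score on interventional data (higher is better)."""
    return -np.abs(graph - true_graph).sum()


def reinforce_grad(gamma, graphs, scores):
    """Score-function estimate of d(-E[score]) / d(gamma), with a mean baseline."""
    probs = sigmoid(gamma)
    advantage = scores - scores.mean()
    # d log Bernoulli(g; sigmoid(gamma)) / d gamma = g - sigmoid(gamma)
    return -np.mean(advantage[:, None, None] * (graphs - probs), axis=0)


def two_cycle_penalty_grad(gamma):
    """Off-diagonal gradient of sum_{i != j} p_ij * p_ji (assumed penalty form)."""
    probs = sigmoid(gamma)
    return 2.0 * probs.T * probs * (1.0 - probs)


def learn_structure(true_graph, steps=300, num_samples=64, lr=0.5, lam=0.1, seed=0):
    """Gradient descent on edge logits; returns the learned edge probabilities."""
    rng = np.random.default_rng(seed)
    n = true_graph.shape[0]
    gamma = np.zeros((n, n))  # sigmoid(0) = 0.5, i.e. every edge starts undecided
    off_diag = 1.0 - np.eye(n)
    for _ in range(steps):
        graphs = sample_graphs(gamma, num_samples, rng)
        scores = np.array([toy_score(g, true_graph) for g in graphs])
        grad = reinforce_grad(gamma, graphs, scores)
        grad += lam * two_cycle_penalty_grad(gamma)
        gamma -= lr * grad * off_diag  # self-loop entries are never updated
    return sigmoid(gamma) * off_diag


# Usage: a 3-node chain 0 -> 1 -> 2, with entry [i, j] = 1 meaning "j is a parent of i".
true_graph = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=float)
edge_probs = learn_structure(true_graph)
print((edge_probs > 0.5).astype(int))
```

In a real setting the score would come from the fitted conditional mechanisms evaluated on interventional samples rather than from the true graph. The sketch only shows how sampled graph configurations and their scores can be turned into a gradient on the edge logits.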
Abstract
Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained from observational data alone. Interventional data provides much richer information about the underlying data-generating process, but extending methods designed for observational data to include interventions is not straightforward and remains an open problem. In this paper we provide a general framework based on continuous optimization and neural networks to create models for the combination of observational and interventional data. The proposed method is applicable even in the challenging and realistic case where the identity of the intervened-upon variable is unknown. We examine the proposed method in the setting of graph recovery, both de novo and from a partially known edge set. We establish strong benchmark results on several structure learning tasks, including structure recovery of both synthetic graphs and standard graphs from the Bayesian Network Repository.
