CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning
Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat
TL;DR
CORE formalizes causal discovery with interventions as a partially observable Markov decision process and learns a dual-branch Q-learning policy that jointly identifies causal graphs and selects informative interventions. It generalizes to unseen structures with up to 10 variables and is sample-efficient, with per-graph inference times in the millisecond range. The approach outperforms the prior state of the art on structure estimation, and its results highlight the importance of learning interventions jointly with structure estimation; the paper also discusses real-world applicability and limitations related to function class and confounding. Overall, CORE is a scalable, data-efficient framework for active causal discovery that uses reinforcement learning to plan interventions and reconstruct causal graphs, with potential impact on automated causal discovery in complex domains.
Abstract
Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, an approach to causal discovery and intervention planning based on deep reinforcement learning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at https://github.com/sa-and/CORE
