Higher-Order Causal Structure Learning with Additive Models
James Enouen, Yujia Zheng, Ignavier Ng, Yan Liu, Kun Zhang
TL;DR
This work tackles causal structure learning with higher-order interactions by extending the causal additive model (CAM) to directed hypergraphs (HDAGs). It develops three hypergraph generalizations (undirected, directed classical, and directed additive noise), establishes identifiability results (hyper Markov equivalence and multi-dependence tests), and introduces HCAM, a greedy algorithm that combines CAM with SIAN-based hyperedge discovery. Empirical results on synthetic ANMs demonstrate that modeling higher-order interactions can yield improvements in some settings (notably 2D), while revealing substantial challenges in 3D due to sample complexity. The framework provides a principled path to capturing complex causal mechanisms beyond pairwise interactions and points to future work on latent confounding and broader identifiability in hypergraph-based causal models.
Abstract
Causal structure learning has long been a central task in inferring causal insights from data. Despite the abundance of real-world processes exhibiting higher-order mechanisms, however, an explicit treatment of interactions in causal discovery has received little attention. In this work, we focus on extending the causal additive model (CAM) to additive models with higher-order interactions. The second level of modularity this adds to the structure learning problem is most easily represented by a directed acyclic hypergraph, which extends the DAG. We introduce the definitions and theoretical tools needed to handle this novel structure and then provide identifiability results for the hyper DAG, extending the typical Markov equivalence classes. We next provide insights into why learning the more complex hypergraph structure may actually lead to better empirical results. In particular, more restrictive assumptions like CAM correspond to easier-to-learn hyper DAGs and better finite sample complexity. Finally, we develop an extension of the greedy CAM algorithm that can handle the more complex hyper DAG search space and demonstrate its empirical usefulness in synthetic experiments.
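As a rough illustration of the extension described above, the two model classes can be contrasted through their structural equations. The notation here is an assumption made for this sketch and may differ from the paper's: pa(j) denotes the parents of node j in the DAG, and \mathcal{E}_j is a hypothetical symbol for the collection of parent subsets (hyperedges) pointing into node j.

```latex
\begin{align}
% CAM: each parent contributes its own univariate function
X_j &= \sum_{k \in \mathrm{pa}(j)} f_{j,k}(X_k) + \varepsilon_j \\
% Higher-order additive model: each hyperedge S contributes a function
% of the joint vector X_S = (X_k)_{k \in S}; notation assumed for this sketch
X_j &= \sum_{S \in \mathcal{E}_j} f_{j,S}(X_S) + \varepsilon_j,
\qquad S \subseteq \mathrm{pa}(j)
\end{align}
```

Under this reading, CAM is the special case in which every S is a single parent, while a single hyperedge S = pa(j) places no additive restriction on the mechanism. The hypergraph records which intermediate case holds for each node.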
