Learning Structural Causal Models from Ordering: Identifiable Flow Models
Minh Khoa Le, Kien Do, Truyen Tran
TL;DR
This work tackles causal inference when only observational data and a valid causal ordering are available. It introduces Identifiable Causal Flow Models (CFMs) that map exogenous variables to endogenous ones via invertible, component-wise transformations, enabling identifiability and parallel learning of all causal mechanisms. The authors design efficient architectures (MAVEN for abduction and an endogenous predictor for parallel prediction) and derive do-operator implementations whose cost scales linearly with the number of model layers rather than with the size of the graph. Empirical results on synthetic and real data show that P-CFM and S-CFM outperform state-of-the-art baselines on observational, interventional, and counterfactual tasks, with substantial speedups that make them suitable for large causal graphs. The approach is shown to apply to complex systems, such as fMRI data, while providing theoretical identifiability guarantees under the assumed conditions.
Abstract
In this study, we address causal inference when only observational data and a valid causal ordering from the causal graph are available. We introduce a set of flow models that can recover component-wise, invertible transformations of the exogenous variables. Our flow-based methods offer flexible model design while maintaining causal consistency regardless of the number of discretization steps. We propose design improvements that enable simultaneous learning of all causal mechanisms and reduce the complexity of abduction and prediction to O(n), linear in the number of layers and independent of the number of causal variables. Empirically, we demonstrate that our method outperforms previous state-of-the-art approaches and delivers consistent performance across a wide range of structural causal models in answering observational, interventional, and counterfactual questions. Additionally, our method achieves a significant reduction in computational time compared to existing diffusion-based techniques, making it practical for large structural causal models.
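To make the abduction, intervention, and prediction steps concrete, here is a minimal Python sketch on a toy structural causal model with a known ordering. It is not the authors' architecture: the hand-written `shift_scale` mechanisms, the function names, and the affine form x_i = mu_i + sigma_i * u_i are assumptions chosen only because each mechanism is then invertible in its own exogenous variable. The sketch uses plain sequential loops and does not reproduce the parallel learning or the layer-linear complexity claimed in the paper.

```python
import numpy as np

def shift_scale(x, i):
    """Hypothetical mechanism for variable i; it reads only x[:i] (the ordering)."""
    prev = x[:i]
    mu = 0.5 * prev.sum()
    sigma = 1.0 + 0.1 * np.abs(prev).sum()  # strictly positive, so invertible in u_i
    return mu, sigma

def predict(u, do=None):
    """Map exogenous u to endogenous x along the ordering.

    `do` is a dict {index: value}; intervened variables ignore their mechanism.
    """
    do = do or {}
    u = np.asarray(u, dtype=float)
    x = np.zeros_like(u)
    for i in range(len(u)):
        if i in do:
            x[i] = do[i]
        else:
            mu, sigma = shift_scale(x, i)
            x[i] = mu + sigma * u[i]
    return x

def abduct(x):
    """Recover exogenous u from observed x by inverting each mechanism."""
    x = np.asarray(x, dtype=float)
    u = np.empty_like(x)
    for i in range(len(x)):
        mu, sigma = shift_scale(x, i)
        u[i] = (x[i] - mu) / sigma
    return u

def counterfactual(x_obs, do):
    """Abduction, then action (do), then prediction."""
    return predict(abduct(x_obs), do=do)

u_true = np.array([0.3, -1.2, 0.7])
x_obs = predict(u_true)
x_cf = counterfactual(x_obs, do={1: 2.0})
```

In this toy setting, `abduct(predict(u))` returns `u` up to floating-point error, and a counterfactual with an empty `do` reproduces the observation. Variables that precede the intervened one in the ordering keep their observed values, while later ones are recomputed from the same exogenous noise.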
