Learning Structural Causal Models from Ordering: Identifiable Flow Models

Minh Khoa Le, Kien Do, Truyen Tran

TL;DR

This work tackles causal inference when only observational data and a valid causal ordering are available. It introduces Identifiable Causal Flow Models (CFMs) that map exogenous variables to endogenous ones via invertible, component-wise transformations, enabling identifiability and parallel learning of all causal mechanisms. The authors design efficient architectures (MAVEN for abduction and an endogenous predictor for parallel prediction) and derive Do-operator implementations that scale linearly with the number of model layers, not with the graph size. Empirical results on synthetic and real data show that P-CFM/S-CFM outperform state-of-the-art baselines in observational, interventional, and counterfactual tasks, with substantial speedups suitable for large causal graphs. The approach demonstrates practical applicability to complex systems, such as fMRI data, while providing theoretical identifiability guarantees under the assumed conditions.

Abstract

In this study, we address causal inference when only observational data and a valid causal ordering from the causal graph are available. We introduce a set of flow models that can recover a component-wise, invertible transformation of the exogenous variables. Our flow-based methods offer flexible model design while maintaining causal consistency regardless of the number of discretization steps. We propose design improvements that enable simultaneous learning of all causal mechanisms and reduce abduction and prediction complexity to linear O(n) in the number of layers, independent of the number of causal variables. Empirically, we demonstrate that our method outperforms previous state-of-the-art approaches and delivers consistent performance across a wide range of structural causal models in answering observational, interventional, and counterfactual questions. Additionally, our method achieves a significant reduction in computational time compared to existing diffusion-based techniques, making it practical for large structural causal models.
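
The abduction and prediction steps described above can be illustrated with a toy sketch. This is not the authors' architecture: it assumes a hand-picked affine velocity $v^{i}(z,c,t)=a_{i}z+\sin(c)(1+t)$ per node, a context $c$ built from the endogenous values of nodes earlier in the ordering, and the convention that $t=0$ holds the data and $t=1$ the exogenous variables. The names `A`, `W`, `abduct`, and `predict` are hypothetical. Because each Euler step is affine in $z$ with a positive slope, it can be inverted exactly for any number of steps, so the round trip recovers the data up to floating-point error.

```python
import numpy as np

# Hypothetical toy parameters for a 3-node SCM with causal ordering (0, 1, 2).
A = np.array([0.5, -0.3, 0.8])       # slope of each node's velocity in z (assumed)
W = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.5, -1.0, 0.0]])     # strictly lower triangular: node i sees only nodes < i


def abduct(x, steps=50):
    """Map endogenous x (t=0) to exogenous u (t=1), all nodes in parallel."""
    h = 1.0 / steps
    ctx = x @ W.T                    # ctx[:, i] depends only on x[:, :i]
    z = x.copy()
    for k in range(steps):
        t = k * h
        # Forward Euler step of dz/dt = A*z + sin(ctx)*(1+t); affine in z.
        z = (1.0 + h * A) * z + h * np.sin(ctx) * (1.0 + t)
    return z


def predict(u, steps=50, do=None):
    """Map exogenous u back to endogenous x, node by node in the ordering.

    `do` is an optional dict {node_index: value} implementing a hard intervention.
    """
    do = do or {}
    h = 1.0 / steps
    x = np.zeros_like(u)
    for i in range(u.shape[1]):
        if i in do:
            x[:, i] = do[i]          # intervened node ignores its mechanism
            continue
        ctx_i = x @ W[i]             # uses only already-computed predecessors
        z = u[:, i].copy()
        for k in reversed(range(steps)):
            t = k * h
            # Exact inverse of the corresponding forward Euler step.
            z = (z - h * np.sin(ctx_i) * (1.0 + t)) / (1.0 + h * A[i])
        x[:, i] = z
    return x


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # stand-in "observations"
u = abduct(x)                        # abduction
x_rec = predict(u)                   # prediction reproduces the observations
x_cf = predict(u, do={1: 0.08})      # counterfactual under do(x^1 = 0.08), 0-indexed
```

In this sketch abduction needs one pass over the time steps for all nodes at once, while prediction walks the ordering node by node. The counterfactual leaves node 0 unchanged, because it precedes the intervened node in the ordering.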

Paper Structure

This paper contains 41 sections, 3 theorems, 18 equations, 13 figures, 3 tables, 6 algorithms.

Key Result

Theorem 1

Let $f\left(u,t\right)$ be the solution of the set of initial value problems (IVPs) for all nodes at time $t$, where the IVP for node $i$ is described in Eq. eq:causal_ivp. If the velocity function $v^{i}\left(z_{t}^{i},u^{<\pi_{i}},t\right)$ is continuous w.r.t. $t$ and Lipschitz continuous w.r.t. $z

Figures (13)

  • Figure 1: Parallel Causal Flow Model (P-CFM) of a simple SCM of 3 nodes (Top), unrolled into abduction (Middle) and prediction processes (Bottom). Green nodes are known, white nodes are to be calculated, and orange nodes are approximated. In abduction, all $\left\{ p\left(z_{t_{i+1}}^{j}|z_{t_{i}}^{j},z_{0}^{<\pi_{j}}\right)\right\} _{j=1}^{d}$ can be calculated with one forward pass. In prediction, we show an example of how $p\left(z_{t_{i}}^{3}|z_{t_{i+1}}^{3},\hat{z}_{0}^{<\pi_{j}}\right)$ can be calculated by approximating $z_{0}^{1},z_{0}^{2}$ using the endogenous predictor $\text{EP}_{\theta}$ given $z_{t_{i+1}}^{1},z_{t_{i+1}}^{2},z_{1}^{1},z_{1}^{2}$. All $\left\{ p\left(z_{t_{i}}^{j}|z_{t_{i+1}}^{j},\hat{z}_{0}^{<\pi_{j}}\right)\right\} _{j=1}^{d}$ can be calculated simultaneously.
  • Figure 2: An illustration of the case where $\hat{f}^{i}\left(u^{i},t\right)$ is not a monotonically increasing function of $u^{i}$ for all $t\in\left(0,1\right)$. Since $\hat{f}^{i}\left(u^{i},t\right)$ is continuous w.r.t. $t$, we can find $\gamma\in(0,t]$ such that $\hat{f}^{i}\left(a,\gamma\right)=\hat{f}^{i}\left(b,\gamma\right)=\zeta$. This leads to two distinct solutions of the IVP starting at $z_{\gamma}^{i}=\zeta$, which contradicts the Picard-Lindelöf theorem.
  • Figure 3: Inference time (in seconds) against number of nodes $d$ for 10 samples with 50 discretization steps on Observational, Interventional, and Counterfactual queries, for data generated using linear structural equations, ranging from $5$ to $50$ nodes.
  • Figure 4: Graph structures of synthetic datasets
  • Figure 5: Pair plot of true (blue) and P-CFM predicted (orange) data for the Diamond, Nonlinear dataset. True and predicted observational samples are displayed on the left. True and predicted interventional samples under $\text{do}\left(x^{2}=0.08\right)$ are shown on the right.
  • ...and 8 more figures

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Proposition 1
  • Theorem 2