
Enabling Causal Discovery in Post-Nonlinear Models with Normalizing Flows

Nu Hoang, Bao Duong, Thin Nguyen

TL;DR

This work tackles causal discovery under Post-Nonlinear (PNL) models by enforcing the crucial invertibility constraint through normalizing flows. It introduces CAF-PoNo, which uses a CDF flow to model the invertible function $g^{-1}$ and neural nets for $h$, enabling maximum-likelihood estimation that recovers the noise and facilitates cause-effect identification via HSIC-based independence tests. The framework extends to multivariate causal discovery through a two-stage process of causal ordering identification and edge pruning, with LCIT-based conditional independence tests providing scalability. Empirical results on synthetic and real data show CAF-PoNo achieving state-of-the-art performance in both bivariate and multivariate settings, including faster runtimes and robust performance under model misspecification. Overall, CAF-PoNo offers a principled, scalable, and invertibility-aware approach to causal discovery in PNL models with practical impact for complex causal systems.
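To make the invertibility idea concrete, here is a minimal sketch of a strictly increasing map built from a mixture of CDFs. It is not the authors' implementation: the logistic components, the final logit step, the function names (`cdf_flow`, `cdf_flow_log_deriv`, `invert`), the bisection bracket, and the parameter values are all assumptions chosen for illustration.

```python
import math

def _sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def cdf_flow(y, weights, locs, scales):
    """Map y to z = logit(sum_k w_k * sigmoid((y - mu_k) / s_k))."""
    u = sum(w * _sigmoid((y - m) / s) for w, m, s in zip(weights, locs, scales))
    return math.log(u) - math.log1p(-u)

def cdf_flow_log_deriv(y, weights, locs, scales):
    """ln |dz/dy|, the log-Jacobian term in a change-of-variables likelihood."""
    u, du = 0.0, 0.0
    for w, m, s in zip(weights, locs, scales):
        p = _sigmoid((y - m) / s)
        u += w * p
        du += w * p * (1.0 - p) / s
    return math.log(du) - math.log(u) - math.log1p(-u)

def invert(z, weights, locs, scales, lo=-20.0, hi=20.0, iters=200):
    """Recover y from z by bisection; valid because cdf_flow is strictly increasing.

    The bracket [lo, hi] is an assumption and must contain the solution.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf_flow(mid, weights, locs, scales) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters: weights are positive and sum to 1, scales are positive.
W, MU, S = [0.3, 0.7], [-1.0, 2.0], [0.5, 1.5]
z = cdf_flow(0.7, W, MU, S)
y_recovered = invert(z, W, MU, S)  # should be close to 0.7
```

Because every mixture weight and scale is positive, the derivative is positive everywhere, so the map is invertible by construction instead of by an added penalty or constraint.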

Abstract

Post-nonlinear (PNL) causal models stand out as a versatile and adaptable framework for modeling intricate causal relationships. However, accurately capturing the invertibility constraint required in PNL models remains challenging in existing studies. To address this problem, we introduce CAF-PoNo (Causal discovery via Normalizing Flows for Post-Nonlinear models), harnessing the power of the normalizing flows architecture to enforce the crucial invertibility constraint in PNL models. Through normalizing flows, our method precisely reconstructs the hidden noise, which plays a vital role in cause-effect identification through statistical independence testing. Furthermore, the proposed approach exhibits remarkable extensibility, as it can be seamlessly expanded to facilitate multivariate causal discovery via causal order identification, empowering us to efficiently unravel complex causal relationships. Extensive experimental evaluations on both simulated and real datasets consistently demonstrate that the proposed method outperforms several state-of-the-art approaches in both bivariate and multivariate causal discovery tasks.
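The independence-testing step can be illustrated with the biased empirical HSIC statistic, $\operatorname{tr}(KHLH)/n^2$. The following is a hedged sketch rather than the paper's code: the Gaussian kernel with a fixed bandwidth of 1, the helper names, and the toy data are assumptions, and a real test would also need a null distribution (for example via permutations) to turn the statistic into a p-value.

```python
import math
import random

def gaussian_gram(values, bandwidth=1.0):
    """Gram matrix K[i][j] = exp(-(v_i - v_j)^2 / (2 * bandwidth^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]

def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # K is symmetric, so column means equal row means.
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(x, y, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(x)
    kc = center(gaussian_gram(x, bandwidth))
    l = gaussian_gram(y, bandwidth)
    return sum(kc[i][j] * l[i][j] for i in range(n) for j in range(n)) / n ** 2

# Toy data (assumed for illustration): an independent draw vs. a noisy copy of x.
rng = random.Random(0)
x_demo = [rng.gauss(0.0, 1.0) for _ in range(200)]
indep = [rng.gauss(0.0, 1.0) for _ in range(200)]
dep = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x_demo]

hsic_indep = hsic(x_demo, indep)  # expected to be near zero
hsic_dep = hsic(x_demo, dep)      # expected to be clearly larger
```

In a bivariate setting, one would compute this statistic between the putative cause and the recovered noise in each direction and prefer the direction with the weaker dependence.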

Paper Structure

This paper contains 23 sections, 3 theorems, 16 equations, 4 figures, 4 tables, 3 algorithms.

Key Result

Lemma 1

(Maximum likelihood for causal discovery under PNL models, Theorem 2 of [zhang2015estimation]). The parameter set $\theta^*$ that maximizes the likelihood $\mathbb{E}[\ln p_\theta(Y\mid X)]$ also minimizes the mutual information of the cause $X$ and the noise $\epsilon_Y$.
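
A short sketch of why this holds, assuming the PNL model $Y = g(h(X) + \epsilon_Y)$ with invertible $g$. The notation is chosen here for illustration and may differ from the paper's: $\hat{\epsilon}_\theta = g_\theta^{-1}(Y) - h_\theta(X)$ is the noise recovered under parameters $\theta$, and $p_{\hat{\epsilon}}$ is its density. By the change-of-variables formula,

$$
\ln p_\theta(Y \mid X) = \ln p_{\hat{\epsilon}}\big(g_\theta^{-1}(Y) - h_\theta(X)\big) + \ln\left|\frac{\partial g_\theta^{-1}(Y)}{\partial Y}\right|.
$$

The map $(X, Y) \mapsto (X, \hat{\epsilon}_\theta)$ has Jacobian determinant $\partial g_\theta^{-1}(Y)/\partial Y$, so $H(X, \hat{\epsilon}_\theta) = H(X, Y) + \mathbb{E}\big[\ln\left|\partial g_\theta^{-1}(Y)/\partial Y\right|\big]$. Substituting into the definition of mutual information gives

$$
\begin{aligned}
I(X; \hat{\epsilon}_\theta) &= H(X) + H(\hat{\epsilon}_\theta) - H(X, \hat{\epsilon}_\theta) \\
&= -H(Y \mid X) - \mathbb{E}\big[\ln p_{\hat{\epsilon}}(\hat{\epsilon}_\theta)\big] - \mathbb{E}\left[\ln\left|\frac{\partial g_\theta^{-1}(Y)}{\partial Y}\right|\right] \\
&= -H(Y \mid X) - \mathbb{E}\big[\ln p_\theta(Y \mid X)\big].
\end{aligned}
$$

Since $H(Y \mid X)$ does not depend on $\theta$, maximizing the expected log-likelihood is equivalent to minimizing $I(X; \hat{\epsilon}_\theta)$. If the modeled noise density differs from the true density of $\hat{\epsilon}_\theta$, the second line becomes an inequality (cross-entropy is at least entropy), and the negative log-likelihood upper-bounds the mutual information up to the constant $H(Y \mid X)$.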

Figures (4)

  • Figure 1: Multivariate causal discovery performance on synthetic data as a function of the number of variables. We fix $n=1,000$ and vary the number of variables. The evaluation metrics are $D_{\mathrm{order}}$, SHD, and SID (lower is better). The reported values are aggregated over 10 independent runs. We compare the proposed CAF-PoNo method with RESIT [peters2014causaldiscovery], NPVar [gao2020apolynomialtime], SCORE [rolland2022scorematching], and AbPNLMulti [uemura2022amultivariate].
  • Figure 2: Multivariate causal discovery performance on synthetic data as a function of sample size. We fix $d=4$ and vary the sample size. The evaluation metrics are $D_{\mathrm{order}}$, SHD, and SID (lower is better). The reported values are aggregated over 10 independent runs. We compare the proposed CAF-PoNo method with RESIT [peters2014causaldiscovery], NPVar [gao2020apolynomialtime], SCORE [rolland2022scorematching], and AbPNLMulti [uemura2022amultivariate].
  • Figure 3: The running time in seconds as a function of the number of variables. The proposed method exhibits a significant reduction in running time compared to AbPNL, demonstrating its potential scalability to high-dimensional data.
  • Figure 4: The performance of the proposed method on multivariate causal structure learning with different pruning approaches as a function of the number of variables, in terms of SHD (lower is better). The pruning approaches include no pruning, the conditional independence test-based approach (CI), and the causal additive model (CAM) approach.
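
Two of the metrics named in the captions above can be sketched as follows. Both functions are illustrative assumptions, not the paper's evaluation code: conventions differ on whether a reversed edge counts as one or two SHD errors (this sketch counts it once), and `order_divergence` assumes that $D_{\mathrm{order}}$ counts true edges that contradict an estimated causal ordering.

```python
def shd(true_adj, est_adj):
    """Structural Hamming distance between two DAG adjacency matrices.

    adj[i][j] == 1 means an edge i -> j. Each node pair whose edge status
    differs (missing, extra, or reversed) contributes one error.
    """
    d = len(true_adj)
    dist = 0
    for i in range(d):
        for j in range(i + 1, d):
            true_pair = (true_adj[i][j], true_adj[j][i])
            est_pair = (est_adj[i][j], est_adj[j][i])
            if true_pair != est_pair:
                dist += 1
    return dist

def order_divergence(order, true_adj):
    """Count true edges i -> j for which j precedes i in the given ordering."""
    position = {node: rank for rank, node in enumerate(order)}
    d = len(true_adj)
    return sum(1 for i in range(d) for j in range(d)
               if true_adj[i][j] and position[i] > position[j])

# Toy graphs (assumed): truth is 0 -> 1 -> 2; the estimate reverses 1 -> 2
# and adds a spurious edge 0 -> 2.
TRUE_ADJ = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
EST_ADJ = [[0, 1, 1],
           [0, 0, 0],
           [0, 1, 0]]

shd_value = shd(TRUE_ADJ, EST_ADJ)                      # one reversal + one extra edge
bad_order_value = order_divergence([0, 2, 1], TRUE_ADJ)  # edge 1 -> 2 is violated
good_order_value = order_divergence([0, 1, 2], TRUE_ADJ)
```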

Theorems & Definitions (4)

  • Lemma 1
  • Definition 1
  • Proposition 2
  • Proposition 3