Ordering-Based Causal Discovery for Linear and Nonlinear Relations

Zhuopeng Xu, Yujie Li, Cheng Liu, Ning Gui

TL;DR

CaPS introduces a novel identification criterion for topological ordering and incorporates the concept of "parent score" during the post-processing optimization stage, helping to accelerate the pruning process and correct inaccurate predictions in the pruning step.

Abstract

Identifying causal relations from purely observational data typically requires additional assumptions on relations and/or noise. Most current methods restrict their analysis to datasets that are assumed to have purely linear or purely nonlinear relations, which often does not reflect real-world datasets that contain a combination of both. This paper presents CaPS, an ordering-based causal discovery algorithm that effectively handles linear and nonlinear relations. CaPS introduces a novel identification criterion for topological ordering and incorporates the concept of "parent score" during the post-processing optimization stage. These scores quantify the strength of the average causal effect, helping to accelerate the pruning process and correct inaccurate predictions in the pruning step. Experimental results demonstrate that our proposed solutions outperform state-of-the-art baselines on synthetic data with varying ratios of linear and nonlinear relations. The results obtained from real-world data also support the competitiveness of CaPS. Code and datasets are available at https://github.com/E2real/CaPS.
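
To make the term "ordering-based" concrete, here is a minimal sketch of the general idea such methods share: a topological ordering fixes which nodes may be parents of which, and a later pruning step removes the spurious candidate edges. This is not the CaPS implementation. The function names `candidate_edges` and `respects_order` and the 4-node example graph are hypothetical, chosen only for illustration.

```python
def candidate_edges(order):
    """Return every edge (i, j) allowed by a topological order.

    A node may only have parents that appear before it in `order`,
    so the result is the densest DAG compatible with that order.
    """
    return {
        (order[a], order[b])
        for a in range(len(order))
        for b in range(a + 1, len(order))
    }


def respects_order(edges, order):
    """Check that every edge i -> j has i placed before j in `order`."""
    position = {node: pos for pos, node in enumerate(order)}
    return all(position[i] < position[j] for i, j in edges)


# Hypothetical ground-truth graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
true_edges = {(0, 1), (0, 2), (1, 3), (2, 3)}

# One valid topological order of that graph (an assumed estimate)
order = [0, 2, 1, 3]

candidates = candidate_edges(order)

# Edges that a pruning step would still have to remove
spurious = candidates - true_edges
```

When the estimated order is valid, the true edges are a subset of `candidates`, so the remaining work is pruning the entries in `spurious`. An incorrect order (for example, one that places node 3 first) would exclude true edges before pruning even starts, which is why the quality of the ordering criterion matters.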

Paper Structure

This paper contains 31 sections, 2 theorems, 24 equations, 11 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Let $s(x) = \nabla \log p(x)$ be the score and let $\text{diag}(\cdot)$ be the diagonal elements of the matrix. For any $x_j$ in the causal graph $\mathcal{G}$:

Figures (11)

  • Figure 1: Performance of different solutions on datasets with different linear proportions. Since we do not know whether real data is linear or nonlinear, it is difficult to choose an effective model. We therefore need a method that works well in the linear, nonlinear, and, most likely, mixed cases.
  • Figure 2: Results on SynER1 and SynER4 with different linear proportions, where a linear proportion of 0.0 means all relations are nonlinear and 1.0 means all relations are linear.
  • Figure 3: F1 score and training time on SynER1 with larger-scale causal graphs.
  • Figure 4: Visualization on the SynER1 dataset. Darker colors indicate stronger causal effects.
  • Figure 5: Example of the Syntren dataset.
  • ...and 6 more figures

Theorems & Definitions (5)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • proof