Hybrid Local Causal Discovery

Zhaolong Ling, Honghui Peng, Yiwen Zhang, Debo Cheng, Xingyu Wu, Peng Zhou, Kui Yu

TL;DR

This work introduces Hybrid Local Causal Discovery (HLCD), a method that overcomes the limitations of purely constraint-based local methods and of global score-based methods applied locally by first building a broad skeleton with OR-rule constraint-based learning and then pruning it via local score gains. It further differentiates V-structures from equivalence classes by score-based comparison, thereby guiding orientation without being misled by score equivalence. Theoretical results (Theorems 1–3) establish local score relationships and correctness guarantees, while extensive experiments across 14 benchmark BNs and two real datasets show that HLCD outperforms state-of-the-art local causal discovery methods, particularly in small-sample regimes, with favorable time efficiency. The findings suggest that HLCD’s hybrid framework effectively leverages score information to refine local causal skeletons and orientations, enabling more reliable local causal inference in practice.

Abstract

Local causal discovery aims to learn and distinguish the direct causes and effects of a target variable from observed data. Existing constraint-based local causal discovery methods use either the AND rule or the OR rule to construct the local causal skeleton, but either rule alone is prone to cascading errors in the learned skeleton, which in turn impair the inference of local causal relationships. On the other hand, directly applying score-based global causal discovery methods to local causal discovery may randomly return incorrect results due to the existence of local equivalence classes. To address these issues, we propose a Hybrid Local Causal Discovery algorithm, called HLCD. Specifically, HLCD first uses a constraint-based approach combined with the OR rule to obtain a candidate skeleton and then employs a score-based method to eliminate redundant portions of the candidate skeleton. Furthermore, during the local causal orientation phase, HLCD distinguishes between V-structures and equivalence classes by comparing the local structure scores of the two, thereby avoiding orientation interference caused by local equivalence classes. We conducted extensive experiments with seven state-of-the-art competitors on 14 benchmark Bayesian network datasets, and the experimental results demonstrate that HLCD significantly outperforms existing local causal discovery algorithms.
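To make the AND/OR distinction concrete, here is a minimal sketch of how the two rules are commonly defined when per-node parent/child (PC) candidate sets are merged into an undirected skeleton. It is an illustration only, not the HLCD procedure: the function name `build_skeleton` and the toy PC sets are hypothetical, and in practice the PC sets would come from conditional independence tests.

```python
from typing import Dict, FrozenSet, Set


def build_skeleton(pc: Dict[str, Set[str]], rule: str) -> Set[FrozenSet[str]]:
    """Merge per-node PC candidate sets into undirected edges.

    AND rule: keep X - Y only if Y is in PC(X) and X is in PC(Y).
    OR rule:  keep X - Y if Y is in PC(X) or X is in PC(Y).
    """
    if rule not in ("AND", "OR"):
        raise ValueError("rule must be 'AND' or 'OR'")
    edges: Set[FrozenSet[str]] = set()
    for x, candidates in pc.items():
        for y in candidates:
            symmetric = x in pc.get(y, set())
            if rule == "OR" or symmetric:
                edges.add(frozenset((x, y)))
    return edges


# Hypothetical PC sets for a target T: only the T/A pair is found from both sides.
pc_sets = {"T": {"A", "B"}, "A": {"T"}, "B": set(), "C": {"T"}}
and_edges = build_skeleton(pc_sets, "AND")  # strict: may drop true edges
or_edges = build_skeleton(pc_sets, "OR")    # permissive: may keep false edges
```

In this toy case the AND rule keeps only T - A, while the OR rule also keeps T - B and T - C. A permissive OR-rule skeleton of this kind is what the abstract describes as the candidate skeleton that is later pruned by a score-based step.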

Paper Structure (20 sections, 19 equations, 4 figures, 20 tables, 1 algorithm)

Figures (4)

  • Figure 1: Directly using a search-and-score algorithm to find the maximum-score local network structure of node "artco2" will randomly return one of the four local structures in (c). Which one is returned may depend on the order in which the variables in the dataset are encountered.
  • Figure 2: The experimental results of normalized F1, where the normalized value is the result of the comparison algorithm divided by the result of HLCD. The larger the normalized F1, the better (the x-axis labels N1 to N14 denote the Bayesian networks. N1: Alarm. N2: Alarm3. N3: Alarm5. N4: Alarm10. N5: Child. N6: Insurance3. N7: Insurance5. N8: Barley. N9: Hailfinder3. N10: Hailfinder5. N11: Hailfinder10. N12: Link. N13: Pigs. N14: Gene).
  • Figure 3: The experimental results of normalized SHD. The lower the normalized SHD, the better (the x-axis labels from N1 to N14 are identical to those in Figure 2).
  • Figure 4: The identification results of the local causal structure for each node by the HLCD algorithm on the Sachs real network. Blue edges indicate that HLCD correctly identified a parent or child node, while red edges signify unsuccessful identification.
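
The score-equivalence issue behind Figure 1 can be reproduced on a toy example. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes the BIC score for discrete data (the scoring function is not named in this section), and the helper `bic`, the variable names X, T, Y, and the count table are all hypothetical. It scores three Markov-equivalent structures around T (two chains and a fork) and the V-structure X → T ← Y.

```python
import math
from collections import Counter


def bic(data, parents):
    """BIC of a discrete DAG given as {node: tuple_of_parents} (assumed score)."""
    n = len(data)
    card = {v: len({row[v] for row in data}) for v in parents}
    score = 0.0
    for node, pa in parents.items():
        family = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        config = Counter(tuple(row[p] for p in pa) for row in data)
        # Maximized log-likelihood contribution of this node given its parents.
        score += sum(c * math.log(c / config[k]) for (k, _), c in family.items())
        # Penalty: free parameters = (states - 1) * number of parent configurations.
        n_params = (card[node] - 1) * math.prod(card[p] for p in pa)
        score -= 0.5 * n_params * math.log(n)
    return score


# Hypothetical counts over (X, T, Y), expanded into one record per sample.
counts = {(0, 0, 0): 30, (0, 0, 1): 10, (0, 1, 0): 5, (0, 1, 1): 15,
          (1, 0, 0): 8, (1, 0, 1): 4, (1, 1, 0): 6, (1, 1, 1): 22}
data = [{"X": x, "T": t, "Y": y}
        for (x, t, y), c in counts.items() for _ in range(c)]

chain = {"X": (), "T": ("X",), "Y": ("T",)}      # X -> T -> Y
rev_chain = {"Y": (), "T": ("Y",), "X": ("T",)}  # X <- T <- Y
fork = {"T": (), "X": ("T",), "Y": ("T",)}       # X <- T -> Y
collider = {"X": (), "Y": (), "T": ("X", "Y")}   # X -> T <- Y (V-structure)

scores = {name: bic(data, g) for name, g in
          [("chain", chain), ("rev_chain", rev_chain),
           ("fork", fork), ("collider", collider)]}
```

The two chains and the fork receive the same score up to floating-point error, so a maximum-score search has no basis for choosing among them, whereas the V-structure scores differently. This is the property that makes a score comparison usable for separating V-structures from equivalence classes.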