Local Causal Discovery with Background Knowledge

Qingyuan Zheng, Yue Liu, Yangbo He

TL;DR

This work develops a local causal discovery framework that leverages background knowledge to refine Markov equivalence classes into a maximally partially directed acyclic graph (MPDAG) and identify causal relations from the local neighborhood of a target variable. It introduces MB-by-MB in MPDAG to learn local structure under direct, non-ancestral, and ancestral knowledge, and derives sufficient and necessary criteria for classifying definite descendants, definite non-descendants, and possible descendants using only local information (including explicit vs. implicit causes). The paper also presents LABITER, a locally focused algorithm that integrates background knowledge to identify causal relations more accurately and efficiently than full-graph methods, with demonstrated improvements in local-structure learning, causal identification, and counterfactual fairness. Applications to fair ML and real data (e.g., Bank Marketing) illustrate practical impact, showing that exploiting prior knowledge can yield more plausible causal inferences and reduced unfairness without sacrificing predictive performance.

Abstract

Causality plays a pivotal role in various fields of study. Based on the framework of causal graphical models, previous works have proposed identifying whether a variable is a cause or non-cause of a target in every Markov equivalent graph solely by learning a local structure. However, the presence of prior knowledge, often represented as a partially known causal graph, is common in many causal modeling applications. Leveraging this prior knowledge allows for the further identification of causal relationships. In this paper, we first propose a method for learning the local structure using all types of causal background knowledge, including direct causal information, non-ancestral information, and ancestral information. Then we introduce criteria for identifying causal relationships based solely on the local structure in the presence of prior knowledge. We also apply our method to fair machine learning, and experiments involving local structure learning, causal relationship identification, and fair machine learning demonstrate that our method is both effective and efficient.

Paper Structure

This paper contains 28 sections, 34 theorems, 3 equations, 10 figures, 2 tables, 5 algorithms.

Key Result

Lemma 1

(zuo2022counterfactual, Theorem 4.5) Let $X$ and $Y$ be two distinct vertices in an MPDAG $\mathcal{G}^*,$ and $\mathbf{C}$ be the critical set of $X$ with respect to $Y$ in $\mathcal{G}^*.$ Then $Y$ is a definite descendant of $X$ in $\left[\mathcal{G}^*\right]$ if and only if either $\mathbf{C}\ca

Figures (10)

  • Figure 1: An example of three Markov equivalent DAGs $\mathcal{G}_1$-$\mathcal{G}_3$ and their corresponding CPDAG $\mathcal{G}^*.$ These Markov equivalent DAGs have the same adjacencies, regardless of edge orientation, and share the same v-structure $A\to Y\leftarrow B$. The undirected edges $A-X-B$ in $\mathcal{G}^*$ indicate that these edges may have different orientations in the Markov equivalent DAGs.
  • Figure 2: An example of learning local structure. (a) The original DAG. (b)-(f) The learned structure $G$ defined at Step $3$ of Algorithm \ref{alg:mb-by-mb in MPDAG}. $G_{D}^{\mathcal{B}}$ denotes the graph learned after running Steps $4$-$16$ of Algorithm \ref{alg:mb-by-mb in MPDAG} for each node in $D$, with background knowledge $\mathcal{B}$.
  • Figure 3: Ratio of the number of conditional independence tests required by MB-by-MB in MPDAG (Algorithm \ref{alg:mb-by-mb in MPDAG}) to the number required by the comparison method (solid red line: MB-by-MB algorithm; dashed cyan line: PC algorithm) when learning local structure in a chain component.
  • Figure 4: Average SHD between the learned local structure and the true MPDAG around the target node. Solid red line: MB-by-MB in MPDAG (Algorithm \ref{alg:mb-by-mb in MPDAG}); dashed green line: MB-by-MB algorithm; dotted blue line: PC algorithm.
  • Figure 5: (a)-(d) Kappa coefficients of different methods on random graphs with $n$ nodes and $N$ samples. (e)-(h) Average CPU time of different methods.
  • ...and 5 more figures
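
The graph described in the Figure 1 caption is small enough to check by hand, so it can serve as a toy illustration of the descendant classification mentioned in the TL;DR. The sketch below is a brute-force enumeration, not the paper's local method (it does not implement MB-by-MB in MPDAG or the critical-set criterion). It assumes the usual definitions: a vertex is a definite descendant of $X$ if it is a descendant of $X$ in every DAG of the class, a definite non-descendant if it is a descendant in none, and a possible descendant otherwise. All function names are hypothetical, and background knowledge is restricted to direct causal edges for simplicity.

```python
from itertools import product

# CPDAG from the Figure 1 caption: A -> Y <- B is directed, A - X - B is undirected.
NODES = ["A", "B", "X", "Y"]
DIRECTED = [("A", "Y"), ("B", "Y")]
UNDIRECTED = [("A", "X"), ("X", "B")]


def is_acyclic(edges):
    """Repeatedly peel off nodes with no incoming edge; fail if none exists."""
    remaining, edges = set(NODES), set(edges)
    while remaining:
        sources = {v for v in remaining if not any(t == v for _, t in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(s, t) for s, t in edges if s not in sources}
    return True


def v_structures(edges):
    """Set of colliders a -> c <- b whose parents a, b are non-adjacent."""
    adjacent = {frozenset(e) for e in edges}
    return {
        (a, c, b)
        for a, c in edges
        for b, c2 in edges
        if c2 == c and a < b and frozenset((a, b)) not in adjacent
    }


def equivalence_class(background=()):
    """DAGs represented by the CPDAG that contain every edge in `background`.

    `background` is a hypothetical encoding of direct causal knowledge as
    (cause, effect) pairs; other knowledge types are not handled here.
    """
    dags = []
    for flips in product([False, True], repeat=len(UNDIRECTED)):
        oriented = [(v, u) if f else (u, v) for (u, v), f in zip(UNDIRECTED, flips)]
        edges = DIRECTED + oriented
        if (is_acyclic(edges)
                and v_structures(edges) == {("A", "Y", "B")}  # no new v-structure
                and all(e in edges for e in background)):
            dags.append(edges)
    return dags


def descendants(edges, x):
    """Nodes reachable from x along directed edges, excluding x itself."""
    seen, stack = set(), [x]
    while stack:
        u = stack.pop()
        for s, t in edges:
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def classify(x, y, dags):
    """Classify y relative to x across all DAGs in `dags`."""
    hits = [y in descendants(d, x) for d in dags]
    if all(hits):
        return "definite descendant"
    if not any(hits):
        return "definite non-descendant"
    return "possible descendant"


if __name__ == "__main__":
    plain = equivalence_class()
    informed = equivalence_class(background=[("X", "A")])
    print(len(plain), classify("X", "A", plain))
    print(len(informed), classify("X", "A", informed))
```

Without background knowledge the enumeration recovers the three DAGs of Figure 1, and $A$ is only a possible descendant of $X$. Adding the direct causal edge $X\to A$ as background knowledge leaves two DAGs, in both of which $A$ is a descendant of $X$. Enumeration scales exponentially in the number of undirected edges, which is the cost that local criteria are meant to avoid.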

Theorems & Definitions (65)

  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Example 1
  • Lemma 2
  • Theorem 3
  • Definition 1
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • ...and 55 more