When Graph Neural Network Meets Causality: Opportunities, Methodologies and An Outlook

Wenzhao Jiang, Hao Liu, Hui Xiong

TL;DR

This survey addresses trustworthiness gaps in graph neural networks by integrating causal learning to handle distribution shifts, unfairness, and lack of explainability. It introduces a taxonomy of causality-inspired GNNs (CIGNNs), categorized by whether they are equipped with causal reasoning or causal representation learning capabilities, and surveys representative methods covering group- and individual-level causal effects, counterfactual explanations, and invariant/variant representation learning. The authors compile datasets, evaluation metrics, and open-source code to standardize assessment and replication, and discuss practical future directions such as scalability, foundation-model integration, and privacy considerations. Overall, the work provides a comprehensive, causality-centered framework for enhancing the reliability and interpretability of graph-based learning in real-world applications.

Abstract

Graph Neural Networks (GNNs) have emerged as powerful representation learning tools for capturing complex dependencies within diverse graph-structured data. Despite their success in a wide range of graph mining tasks, GNNs have raised serious concerns regarding their trustworthiness, including susceptibility to distribution shift, biases towards certain populations, and lack of explainability. Recently, integrating causal learning techniques into GNNs has sparked numerous ground-breaking studies, since many GNN trustworthiness issues can be alleviated by capturing the underlying data causality rather than superficial correlations. In this survey, we comprehensively review recent research efforts on Causality-Inspired GNNs (CIGNNs). Specifically, we first employ causal tools to analyze the primary trustworthiness risks of existing GNNs, underscoring the necessity for GNNs to comprehend the causal mechanisms within graph data. Moreover, we introduce a taxonomy of CIGNNs based on the type of causal learning capability they are equipped with, i.e., causal reasoning and causal representation learning. We then systematically introduce typical methods within each category and discuss how they mitigate trustworthiness risks. Finally, we summarize useful resources and discuss several future directions, hoping to shed light on new research opportunities in this emerging field. The representative papers, along with open-source data and code, are available at https://github.com/usail-hkust/Causality-Inspired-GNNs.

Paper Structure (44 sections, 2 theorems, 33 equations, 6 figures, 3 tables)

Key Result

Theorem 1

Given observable variables $C$ that block all backdoor paths between $T$ and $Y$, under the modularity and positivity assumptions [stat2causal2022ICM], we have

$$P(Y \mid do(T = t)) = \sum_{c} P(Y \mid T = t, C = c)\, P(C = c).$$
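
As a rough illustration of why the backdoor adjustment in Theorem 1 matters, the sketch below compares a naive conditional contrast with the adjusted one, which averages $P(Y \mid T, C)$ over the marginal distribution of $C$. Everything in it is an assumption made for illustration and does not come from the survey: the binary variables, the probability tables `P_C` and `P_T1_GIVEN_C`, and the helper names `joint`, `prob`, `naive`, and `backdoor`. The toy model is built so that the true effect of $T$ on $Y$ is 0.3.

```python
from itertools import product

# Hypothetical toy model over binary C (confounder), T (treatment), Y (outcome).
P_C = {0: 0.5, 1: 0.5}            # P(C = c)
P_T1_GIVEN_C = {0: 0.2, 1: 0.8}   # P(T = 1 | C = c); positivity holds


def p_y1_given_tc(t, c):
    """P(Y = 1 | T = t, C = c); T adds 0.3 by construction."""
    return 0.1 + 0.3 * t + 0.5 * c


def joint():
    """Exact joint distribution P(C, T, Y), keyed by (c, t, y)."""
    table = {}
    for c, t, y in product((0, 1), repeat=3):
        p_t = P_T1_GIVEN_C[c] if t == 1 else 1 - P_T1_GIVEN_C[c]
        p_y1 = p_y1_given_tc(t, c)
        p_y = p_y1 if y == 1 else 1 - p_y1
        table[(c, t, y)] = P_C[c] * p_t * p_y
    return table


def prob(table, **fixed):
    """Probability of the event that fixes some of c, t, y."""
    names = ("c", "t", "y")
    return sum(
        p for key, p in table.items()
        if all(key[names.index(k)] == v for k, v in fixed.items())
    )


def naive(table, t):
    """Observational quantity P(Y = 1 | T = t)."""
    return prob(table, t=t, y=1) / prob(table, t=t)


def backdoor(table, t):
    """P(Y = 1 | do(T = t)) = sum_c P(Y = 1 | T = t, C = c) P(C = c)."""
    total = 0.0
    for c in (0, 1):
        p_y_given_tc = prob(table, c=c, t=t, y=1) / prob(table, c=c, t=t)
        total += p_y_given_tc * prob(table, c=c)
    return total


table = joint()
naive_effect = naive(table, 1) - naive(table, 0)
adjusted_effect = backdoor(table, 1) - backdoor(table, 0)
print(f"naive difference:  {naive_effect:.2f}")
print(f"backdoor-adjusted: {adjusted_effect:.2f}")
```

In this toy setting the naive contrast is 0.60 because $C$ raises both $T$ and $Y$, while the adjusted contrast recovers the built-in effect of 0.30. The example uses an exact joint table rather than sampled data so that the gap reflects confounding alone and not estimation noise.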

Figures (6)

  • Figure 1: Two causal graphs that characterize the graph generation process in graph- or node-level tasks. $Y$ denotes the label or the model prediction of $\mathcal{G}$. $C$ denotes (hidden) confounders. The black solid arrow indicates a causal relation and the red dashed arrow indicates a spurious correlation. Fig. (a) helps reveal the reasons for GNNs' poor OOD generalizability and explainability, where $V$ and $I$ denote the variant and invariant graph generation factors that non-causally and causally affect $Y$, respectively. Fig. (b) aids in explaining graph unfairness, where $\mathbf{S}$ and $\mathbf{X}$ denote the sensitive and insensitive graph attributes, respectively.
  • Figure 2: A detailed taxonomy of existing CIGNNs based on their empowered causal learning capability.
  • Figure 3: General pipeline of integrating stable learning into GNNs.
  • Figure 4: An illustrative framework integrating SCM into VGAE.
  • Figure 5: A general pipeline of generating GCEs for a target GNN.
  • ...and 1 more figure

Theorems & Definitions (10)

  • Definition 1: Potential Outcome
  • Definition 2: Structural Causal Model
  • Theorem 1: Backdoor Adjustment
  • Definition 3: OOD Generalization on Graphs [oodgraphsurvey2022]
  • Definition 4: Graph Counterfactual Fairness [gear2022wsdm]
  • Definition 5: Post-hoc Explainability
  • Definition 6: Inherent Interpretability
  • Definition 7: Frontdoor Criterion
  • Theorem 2: Frontdoor Adjustment
  • Definition 8: Graph Counterfactual Explanation (GCE) [gce_survey2022]
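
For reference, the frontdoor adjustment named in Theorem 2 is usually written as below in the causal inference literature. This is the standard textbook form, not a transcription of the paper's statement: the mediator symbol $M$ (a set of variables assumed to satisfy the frontdoor criterion relative to $T$ and $Y$) and the summation indices $m$ and $t'$ are assumptions introduced here.

```latex
P(Y \mid do(T = t))
  = \sum_{m} P(M = m \mid T = t)
    \sum_{t'} P(Y \mid T = t', M = m)\, P(T = t')
```

Unlike the backdoor adjustment, this form does not require the confounders of $T$ and $Y$ to be observed. It identifies the effect through the mediator instead.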