CI-GNN: A Granger Causality-Inspired Graph Neural Network for Interpretable Brain Network-Based Psychiatric Diagnosis

Kaizhong Zheng, Shujian Yu, Badong Chen

TL;DR

CI-GNN tackles the need for interpretable brain-network–based psychiatric diagnosis by integrating a Granger-causality-inspired regularization into a GraphVAE to disentangle causal ($\alpha$) and noncausal ($\beta$) factors. It learns a causal subgraph from $\alpha$ and uses it for end-to-end classification, enforcing causality with $I(\alpha;Y|\beta)$. The approach yields edge-level explanations that are more faithful and clinically plausible than post-hoc methods, achieving state-of-the-art accuracy on synthetic and large multi-site brain datasets. Its demonstrated generalization to molecular graphs suggests broader applicability beyond neuroscience.

Abstract

There is a recent trend to leverage the power of graph neural networks (GNNs) for brain-network based psychiatric diagnosis, which, in turn, motivates an urgent need for psychiatrists to fully understand the decision behavior of the GNNs used. However, most existing GNN explainers are either post-hoc, in which another interpretive model needs to be created to explain a well-trained GNN, or do not consider the causal relationship between the extracted explanation and the decision, such that the explanation itself contains spurious correlations and suffers from weak faithfulness. In this work, we propose a Granger causality-inspired graph neural network (CI-GNN), a built-in interpretable model that is able to identify the most influential subgraph (i.e., functional connectivity within brain regions) that is causally related to the decision (e.g., major depressive disorder patients or healthy controls), without training an auxiliary interpretive network. CI-GNN learns disentangled subgraph-level representations $\alpha$ and $\beta$ that encode, respectively, the causal and noncausal aspects of the original graph under a graph variational autoencoder framework, regularized by a conditional mutual information (CMI) constraint. We theoretically justify the validity of the CMI regularization in capturing the causal relationship. We also empirically evaluate the performance of CI-GNN against three baseline GNNs and four state-of-the-art GNN explainers on synthetic data and three large-scale brain disease datasets. We observe that CI-GNN achieves the best performance on a wide range of metrics and provides more reliable and concise explanations that are supported by clinical evidence. The source code and implementation details of CI-GNN are freely available in the GitHub repository (https://github.com/ZKZ-Brain/CI-GNN/).

Paper Structure

This paper contains 32 sections, 1 theorem, 22 equations, 14 figures, 15 tables, 1 algorithm.

Key Result

Corollary 1

$I(\alpha;Y|\beta)$ is able to measure the causal effect of $\alpha$ on $Y$ when "imposing" $\beta$ in the sense of Granger causality (Seth, 2007).
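
To make the quantity in Corollary 1 concrete, the following is a minimal sketch of a plug-in estimate of $I(\alpha;Y|\beta)$ on toy discrete samples. It is an illustration only: the function name `conditional_mutual_information` and the toy data are assumptions introduced here, and the counting estimator is not the estimator CI-GNN applies to its continuous latent factors, which this summary does not specify.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(alpha, y, beta):
    """Plug-in estimate of I(alpha; Y | beta) in bits from paired discrete samples.

    Hypothetical helper for illustration; not taken from the CI-GNN code base.
    """
    n = len(alpha)
    if n == 0 or not (n == len(y) == len(beta)):
        raise ValueError("alpha, y and beta must be non-empty and equally long")

    # Empirical counts of the joint and marginal events.
    n_aby = Counter(zip(alpha, beta, y))
    n_ab = Counter(zip(alpha, beta))
    n_by = Counter(zip(beta, y))
    n_b = Counter(beta)

    cmi = 0.0
    for (a, b, t), count in n_aby.items():
        # p(a,b,y) * p(b) / (p(a,b) * p(b,y)), written with raw counts.
        ratio = count * n_b[b] / (n_ab[(a, b)] * n_by[(b, t)])
        cmi += (count / n) * log2(ratio)
    return cmi


# Toy data (assumed): alpha and beta are independent fair bits.
alpha = [0, 0, 1, 1]
beta = [0, 1, 0, 1]

# Case 1: the label is determined by alpha, so alpha is informative given beta.
cmi_causal = conditional_mutual_information(alpha, y=alpha, beta=beta)

# Case 2: the label is determined by beta, so alpha adds nothing given beta.
cmi_spurious = conditional_mutual_information(alpha, y=beta, beta=beta)

print(cmi_causal, cmi_spurious)
```

In this toy setting the estimate is 1 bit when the label copies $\alpha$ and 0 bits when the label copies $\beta$, which matches the intuition that $I(\alpha;Y|\beta)$ only credits information about $Y$ that $\alpha$ carries beyond $\beta$.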

Figures (14)

  • Figure 1: Overview of the pipeline for (a) a traditional psychiatric diagnostic model and (b) a modern diagnostic model with a built-in interpretable graph neural network (e.g., our CI-GNN). The resting-state fMRI data are parcellated by a brain atlas such as the automated anatomical labelling (AAL) atlas, and the functional connectivity matrices are then computed. The traditional psychiatric diagnostic model follows a two-stage training strategy: it first selects the most informative features using feature selection techniques and then discriminates psychiatric patients from healthy controls using classic classification models on top of the selected features. For CI-GNN, the functional connectivity matrices are converted to functional graphs, which are then sent to CI-GNN to make a decision (i.e., psychiatric patient or healthy control). Our CI-GNN can also discover the most informative edges, a.k.a. potential biomarkers, for each participant.
  • Figure 2: The taxonomy of GNN explanation approaches. The red box is the GNN explanation involved in CI-GNN.
  • Figure 3: (a) A directed acyclic graph reflecting the cause-effect relationship between latent factor $\alpha$ and label $Y$, whereas there is only a spurious correlation between $\beta$ and $Y$; (b)-(c) Visualization of causal and non-causal subgraphs for House and Cycle motif classification. Here, the House and Cycle motifs are the causal subgraphs, while the Tree motif is the non-causal subgraph.
  • Figure 4: The overall architecture of our proposed CI-GNN. The model consists of four modules: GraphVAE, a causal effect estimator, a causal subgraph generator, and a basic classifier $\varphi$. Given an input graph $G=\{A,X\}$, GraphVAE learns (disentangled) latent factors $Z=[\alpha;\beta]$. The causal effect estimator ensures that only $\alpha$ is causally related to label $Y$ through the conditional mutual information (CMI) regularization $I\left ( \alpha; Y|\beta \right )$. Based on $\alpha$, we introduce another linear decoder $\theta_2$ to generate the causal subgraph $\mathcal{G}_{\text{sub}}$, which can then be used for graph classification by classifier $\varphi$.
  • Figure 5: Venn diagram depicting entropy interaction among $\alpha$, $\beta$ and $Y$. $H\left ( \alpha \right )=\left \{ a,e,g,d \right \}$, $H\left ( Y \right )=\left \{ b,e,g,f \right \}$, $H\left ( \beta \right )=\left \{ c,d,g,f \right \}$, $I\left ( \alpha; Y \right )=\left \{ e,g \right \}$, $I\left ( \alpha;\beta \right )=\left \{ d,g \right \}$, $I\left ( Y;\beta \right )=\left \{ g,f \right \}$ and $H\left ( \alpha,Y,\beta \right )=\left \{ a,b,c,d,e,f,g \right \}$
  • ...and 9 more figures
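
Using the region labels given in the Figure 5 caption, the CMI term can be located in the Venn diagram with a short worked step. This is a sketch based only on the caption's region assignments; as usual for three-variable information diagrams, the central region $g$ is a signed quantity, so the picture is a bookkeeping aid rather than a literal area.

$$
\begin{aligned}
I(\alpha;Y|\beta) &= H(\alpha,\beta) + H(Y,\beta) - H(\beta) - H(\alpha,Y,\beta) \\
&= \{a,c,d,e,f,g\} + \{b,c,d,e,f,g\} - \{c,d,f,g\} - \{a,b,c,d,e,f,g\} \\
&= \{e\}.
\end{aligned}
$$

Equivalently, $I(\alpha;Y|\beta) = I(\alpha;Y) - \{g\} = \{e,g\} - \{g\} = \{e\}$: the CMI keeps only the part of the overlap between $\alpha$ and $Y$ that lies outside $H(\beta)$.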

Theorems & Definitions (3)

  • Corollary 1
  • Definition 1
  • Definition 2