Incorporating Retrieval-based Causal Learning with Information Bottlenecks for Interpretable Graph Neural Networks

Jiahua Rao; Jiancong Xie; Hanjing Lin; Shuangjia Zheng; Zhen Wang; Yuedong Yang

TL;DR

This work tackles the interpretability of Graph Neural Networks by proposing RC-GNN, a framework that unifies explanation and prediction through retrieval-based causal learning embedded within the Graph Information Bottleneck (GIB). It introduces a semi-parametric architecture with a subgraph retrieval module that identifies informative explanations and a causal graph learning module that compresses these explanations via backdoor adjustment and contrastive learning. The theoretical framing ties the two mutual-information terms to the two goals: maximizing $I(Y; \mathcal{G}_S)$ keeps explanations informative about the label, and minimizing $I(\mathcal{G}; \mathcal{G}_S)$ limits leakage from the input graph. Concretely, the framework optimizes $I(Y; \mathcal{G}_S) - \beta I(\mathcal{G}; \mathcal{G}_S)$, uses backdoor adjustment to minimize the information retained from the input graph, and leverages counterfactual contrastive learning to stabilize causal explanations. Empirical results show substantial improvements in explanation quality (Recall/Precision@5) and graph classification accuracy across multiple datasets, including challenging molecular benchmarks, suggesting practical impact for high-stakes graph tasks.
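To make the trade-off in $I(Y; \mathcal{G}_S) - \beta I(\mathcal{G}; \mathcal{G}_S)$ concrete, the following is a minimal sketch that evaluates the objective on a toy discrete dataset. It is not the RC-GNN training procedure: graphs, subgraphs, and labels are treated as plain discrete symbols, mutual information is estimated from empirical counts, and the names `mutual_information` and `gib_objective` as well as the toy data are assumptions introduced here for illustration only.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information I(A; B) in nats from (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    return mi


def gib_objective(graphs, subgraphs, labels, beta):
    """Toy value of I(Y; G_S) - beta * I(G; G_S) (hypothetical helper)."""
    i_y_s = mutual_information(list(zip(labels, subgraphs)))
    i_g_s = mutual_information(list(zip(graphs, subgraphs)))
    return i_y_s - beta * i_g_s


# Toy data (assumed): four distinct graphs, two classes.
graphs = ["g0", "g1", "g2", "g3"]
labels = [0, 0, 1, 1]

# Extractor A keeps the whole graph, so G_S carries everything in G.
identity_subgraphs = ["g0", "g1", "g2", "g3"]
# Extractor B keeps only a class-specific motif, discarding the rest.
motif_subgraphs = ["m0", "m0", "m1", "m1"]

beta = 0.5
score_identity = gib_objective(graphs, identity_subgraphs, labels, beta)
score_motif = gib_objective(graphs, motif_subgraphs, labels, beta)
print(score_identity, score_motif)
```

In this toy setting both extractors are equally informative about the label, but the motif extractor retains less of the input graph, so it receives the higher score. That is the behavior the bottleneck term is meant to reward.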

Abstract

Graph Neural Networks (GNNs) have gained considerable traction for their ability to process topological data effectively, yet their interpretability remains a critical concern. Current interpretation methods are dominated by post-hoc explanations, which aim to provide a transparent and intuitive understanding of GNNs. However, they perform poorly on complicated subgraphs and cannot use the explanation to improve GNN predictions. Transparent GNN models, on the other hand, have been proposed to capture critical subgraphs. While such methods can improve GNN predictions, they usually do not perform well on explanations. A new strategy is therefore needed to better couple GNN explanation and prediction. In this study, we develop a novel interpretable causal GNN framework that incorporates retrieval-based causal learning with Graph Information Bottleneck (GIB) theory. The framework semi-parametrically retrieves crucial subgraphs detected by GIB and compresses the explanatory subgraphs via a causal module. It consistently outperforms state-of-the-art methods and achieves 32.71% higher precision on real-world explanation scenarios with diverse explanation types. More importantly, the learned explanations are also shown to improve GNN prediction performance.
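The precision figures above refer to explanation metrics such as the Recall/Precision@5 named in the TL;DR. The exact evaluation protocol is not given in this summary, so the sketch below shows only one common convention: rank edges by an importance score, keep the top k, and compare them with a set of ground-truth explanation edges. The function name `precision_recall_at_k`, the edge scores, and the ground-truth set are all hypothetical.

```python
def precision_recall_at_k(edge_scores, ground_truth_edges, k=5):
    """Precision@k and Recall@k for a ranked edge explanation.

    edge_scores: dict mapping an edge (u, v) to an importance score.
    ground_truth_edges: set of edges that form the reference explanation.
    Assumed convention: precision divides by k, recall by the number of
    ground-truth edges.
    """
    ranked = sorted(edge_scores, key=edge_scores.get, reverse=True)
    top_k = ranked[:k]
    hits = sum(1 for edge in top_k if edge in ground_truth_edges)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(ground_truth_edges) if ground_truth_edges else 0.0
    return precision, recall


# Hypothetical scores for the edges of one small graph.
edge_scores = {
    (0, 1): 0.90,
    (1, 2): 0.80,
    (3, 4): 0.70,
    (4, 5): 0.60,
    (5, 0): 0.50,
    (2, 3): 0.10,
    (0, 2): 0.05,
}
# Hypothetical ground-truth explanation (for example, a labeled motif).
ground_truth = {(0, 1), (1, 2), (2, 3)}

precision, recall = precision_recall_at_k(edge_scores, ground_truth, k=5)
print(precision, recall)
```

Here two of the three ground-truth edges land in the top five, so the ranking misses one explanation edge, and three of the five selected edges are false positives.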

Paper Structure (40 sections, 20 equations, 4 figures, 7 tables, 1 algorithm)

Introduction
Related Works
Explainability of GNNs
Interpretable Graph Learning
Preliminaries
Explanations for GNNs
Graph Information Bottleneck
Causal View on GNNs
Theoretical Analysis
Maximizing $I (Y, \mathcal{G}_{S})$
Minimizing $I(\mathcal{G}, \mathcal{G}_{S})$
Methodology
Model Architecture
Subgraph Retrieval
Causal Graph Learning
...and 25 more sections

Figures (4)

Figure 1: The overall framework of our method.
Figure 2: Explanatory subgraphs for MUTAG and CYP3A4.
Figure 3: Visualization of $H_{\mathcal{G}_c}$, $H_{\mathcal{G}_t}$ and $H_{\mathcal{G}}$ on MUTAG and MutagenicityV2 by t-SNE (van der Maaten and Hinton, 2008), with points colored by graph label.
Figure 4: More examples of explanatory subgraphs in MUTAG.

Theorems & Definitions (2)

Definition 3.1
Definition 3.2
