Improving the interpretability of GNN predictions through conformal-based graph sparsification

Pablo Sanchez-Martin; Kinaan Aamir Khan; Isabel Valera

TL;DR

CORES tackles the interpretability gap in GNNs by learning a predictive subgraph $\mathcal{G}_s$ during training through node/edge removal, without assuming any subgraph structure. It combines reinforcement learning (policy gradient with PPO) and conformal prediction to guide sparsification, using a two-part reward $R = \lambda R_p + (1-\lambda) R_s$, where $R_p$ rewards predictive performance and $R_s$ rewards sparsity, together with a conformal-based uncertainty mechanism. The approach solves a bi-level optimization in which the inner loop optimizes graph-classification performance on $\mathcal{G}_s$ while the outer loop learns the sparsification policy, yielding sparser, more interpretable predictions with competitive accuracy across nine graph datasets. Empirically, CORESN and CORESE produce significantly sparser predictive subgraphs than the baselines with competitive or superior performance, and the resulting subgraphs exhibit motifs aligned with domain knowledge, at the cost of longer training time due to the RL components.
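As a minimal sketch of how the two-part reward $R = \lambda R_p + (1-\lambda) R_s$ trades off its terms, the snippet below computes the weighted sum for given values of the two components. The function name `combined_reward`, its parameter names, and the restriction of $\lambda$ to $[0, 1]$ are assumptions made for illustration; how $R_p$ and $R_s$ themselves are computed is not specified in this summary and is not modelled here.

```python
def combined_reward(r_performance: float, r_sparsity: float, lam: float) -> float:
    """Weighted sum R = lam * R_p + (1 - lam) * R_s.

    Hypothetical helper: r_performance and r_sparsity stand in for the
    performance and sparsity terms, which are computed elsewhere.
    """
    # Assumption: lam is a mixing weight in [0, 1], so R is a convex combination.
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * r_performance + (1.0 - lam) * r_sparsity


if __name__ == "__main__":
    # A larger lam puts more weight on the performance term.
    for lam in (0.0, 0.5, 1.0):
        print(lam, combined_reward(r_performance=1.0, r_sparsity=0.5, lam=lam))
```

With $\lambda = 1$ the sparsity term is ignored entirely, and with $\lambda = 0$ only sparsity is rewarded, which is why the value of $\lambda$ is a natural target for an ablation.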

Abstract

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN \emph{training} approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes, \emph{without making assumptions about the subgraph structure}, while ii) optimizing the performance of the graph classification task. To that end, we rely on reinforcement learning to solve the resulting bi-level optimization with a reward function based on conformal predictions to account for the current in-training uncertainty of the classifier. Our empirical results on nine different graph classification datasets show that our method competes in performance with baselines while relying on significantly sparser subgraphs, leading to more interpretable GNN-based predictions.
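To make the notion of a conformal prediction concrete, here is a minimal sketch of standard split conformal prediction for classification. It is a generic textbook construction, not necessarily the variant used in CORES, whose exact reward is not given in this summary. The function names, the nonconformity score $1 - p(y)$, and the toy calibration data are all assumptions for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a held-out calibration set (split conformal).

    cal_probs: list of per-class probability lists, one per calibration example.
    cal_labels: list of true class indices.
    alpha: target miscoverage level, e.g. 0.1 for roughly 90% coverage.
    """
    n = len(cal_labels)
    # Nonconformity score (assumed): one minus the probability of the true class.
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every class is included.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    """All classes whose nonconformity score does not exceed the threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= threshold]


if __name__ == "__main__":
    # Toy calibration data (made up for the example).
    cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
    cal_labels = [0, 1, 0, 1]
    t = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
    print(prediction_set([0.95, 0.05], t))
```

The size of the prediction set is a simple proxy for classifier uncertainty: a confident classifier yields small sets, while an unreliable one, for example early in training, yields large sets.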

Paper Structure (38 sections, 6 equations, 7 figures, 12 tables, 1 algorithm)

Introduction
Preliminaries
Notation.
Reinforcement learning
Policy Gradient Methods.
Graph neural networks
Conformal Prediction
Related work
Soft removal of information.
Hard removal of information.
CORES: Conformal-based reinforcement learning for graph sparsification
Optimizing for sparsity and performance
Performance optimization.
Sparsity optimization.
Unpacking the sparsity optimization
...and 23 more sections

Figures (7)

Figure 1: CORES pipeline. Illustration of the CORES pipeline on the synthetic BA2Shapes dataset, designed for binary graph classification. On the left are two examples of original graphs, denoted $\mathcal{G}$, from the positive class (top, cycle motif) and the negative class (bottom, house motif). The CORES policy $\phi$ takes $\mathcal{G}$ and sparsifies it into the predictive subgraph $\mathcal{G}_s$, which retains only the information relevant to the task, i.e., the motifs. Finally, the graph classifier $\theta$ takes $\mathcal{G}_s$ as input and produces the prediction $y \in \{0, 1\}$.
Figure 2: Evolution of the sparsity reward $\mathrm{R}_s$ over the node/edge ratio for different values of the maximum desired node/edge ratio $d$.
Figure 3: Ablation study on the maximum desired ratio $d$. We use the GIN architecture and run each experiment for 5 different folds.
Figure 4: Ablation study on $\lambda$. We use the GIN architecture and run each experiment for 5 different folds.
Figure 5: Illustration of the predictive subgraph obtained by CORESN for a mutagenic (top) and non-mutagenic (bottom) graph of the MUTAG dataset.
...and 2 more figures
