Beyond Edge Deletion: A Comprehensive Approach to Counterfactual Explanation in Graph Neural Networks

Matteo De Sanctis, Riccardo De Sanctis, Stefano Faralli, Paola Velardi, Bardh Prenkaj

TL;DR

This work introduces XPlore, a technique that broadens the counterfactual search space beyond edge deletion by applying gradient-guided edge insertions and node-feature perturbations. It also proposes a cosine similarity metric for learned graph embeddings that addresses a key limitation of traditional distance-based metrics, and shows that XPlore produces more coherent and minimal counterfactuals.

Abstract

Graph Neural Networks (GNNs) are increasingly adopted across domains such as molecular biology and social network analysis, yet their black-box nature hinders interpretability and trust. This is especially problematic in high-stakes applications, such as molecule toxicity prediction, drug discovery, and financial fraud detection, where transparent explanations are essential. Counterfactual explanations (minimal changes that flip a model's prediction) offer a transparent lens into GNNs' behavior. In this work, we introduce XPlore, a novel technique that significantly broadens the counterfactual search space by applying gradient-guided perturbations to the adjacency and node feature matrices. Unlike most prior methods, which focus solely on edge deletions, our approach belongs to the growing class of techniques that optimize edge insertions and node-feature perturbations, here performed jointly under a unified gradient-based framework, enabling a richer and more nuanced exploration of counterfactuals. To quantify both structural and semantic fidelity, we introduce a cosine similarity metric for learned graph embeddings that addresses a key limitation of traditional distance-based metrics, and we demonstrate that XPlore produces more coherent and minimal counterfactuals. Empirical results on 13 real-world and 5 synthetic benchmarks show up to +56.3% improvement in validity and +52.8% in fidelity over state-of-the-art baselines, while retaining competitive runtime.
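
As a rough illustration of how a cosine similarity on graph embeddings differs from a distance-based metric, the sketch below compares the two on hand-picked vectors. The abstract does not say which limitation of distance-based metrics the paper targets, so the sensitivity to embedding norm shown here is only one plausible example. The embedding values and the names `emb_original`, `emb_scaled`, and `emb_rotated` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Plain L2 distance, used here as the distance-based baseline."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical graph embeddings (illustrative values only).
emb_original = [1.0, 2.0, 2.0]
emb_scaled = [2.0, 4.0, 4.0]    # same direction as the original, twice the norm
emb_rotated = [2.0, -1.0, 2.0]  # same norm as the original, different direction

sim_scaled = cosine_similarity(emb_original, emb_scaled)
sim_rotated = cosine_similarity(emb_original, emb_rotated)
dist_scaled = euclidean_distance(emb_original, emb_scaled)
dist_rotated = euclidean_distance(emb_original, emb_rotated)

print(f"scaled : cosine={sim_scaled:.3f}  euclidean={dist_scaled:.3f}")
print(f"rotated: cosine={sim_rotated:.3f}  euclidean={dist_rotated:.3f}")
```

In this toy case the Euclidean distance rates the rescaled and the rotated embedding as almost equally far from the original, while cosine similarity separates them clearly because it ignores vector norm.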

Paper Structure (43 sections, 2 theorems, 32 equations, 10 figures, 8 tables, 1 algorithm)

Key Result

Lemma 1

Assume $L_\mathrm{pred}$ is bounded below and set $\Delta = L_\mathrm{pred} - \inf_\Theta L_\mathrm{pred}$, the gap between $L_\mathrm{pred}$ and the infimum (greatest lower bound) of the prediction loss over all feasible $\theta \in \Theta = [0,1]^{n\times n} \times \dots$ [...] where $\theta_0$ is the original unperturbed $(P,N)$-vector.
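
The feasible set in the lemma constrains the relaxed adjacency entries to the box $[0,1]^{n\times n}$. One common way to optimize under such a constraint is projected gradient descent, sketched below in plain Python. This is a minimal sketch under stated assumptions, not the paper's algorithm: `toy_loss_and_grad` is a hypothetical stand-in for $L_\mathrm{pred}$ (a squared error on a linear edge score, which is bounded below by 0), the weights `W`, the target, the step size, and the 0.5 threshold are all made up for illustration, and the node-feature part of $\theta$ is left out because its domain is cut off in the statement above.

```python
def toy_loss_and_grad(P, W, target):
    """Hypothetical stand-in for L_pred: squared error of a linear edge score."""
    n = len(P)
    score = sum(W[i][j] * P[i][j] for i in range(n) for j in range(n))
    residual = score - target
    grad = [[2.0 * residual * W[i][j] for j in range(n)] for i in range(n)]
    return residual ** 2, grad


def projected_gradient_step(P, grad, lr):
    """One gradient descent step, then projection onto the box [0, 1]."""
    n = len(P)
    return [
        [min(1.0, max(0.0, P[i][j] - lr * grad[i][j])) for j in range(n)]
        for i in range(n)
    ]


def discretize(P, threshold=0.5):
    """Turn the relaxed adjacency into a 0/1 adjacency (threshold is an assumption)."""
    return [[1 if p > threshold else 0 for p in row] for row in P]


# Original graph: a path 0 - 1 - 2, stored as a relaxed (float) adjacency matrix.
P0 = [[0.0, 1.0, 0.0],
      [1.0, 0.0, 1.0],
      [0.0, 1.0, 0.0]]

# Hypothetical symmetric edge weights with a zero diagonal, so the updates
# keep the matrix symmetric and never create self-loops.
W = [[0.0, 1.0, -1.0],
     [1.0, 0.0, 0.5],
     [-1.0, 0.5, 0.0]]

target = -2.0  # hypothetical score at which the toy model "flips" its output
lr = 0.05

P = P0
losses = []
for _ in range(50):
    loss, grad = toy_loss_and_grad(P, W, target)
    losses.append(loss)
    P = projected_gradient_step(P, grad, lr)

final_P = P
counterfactual = discretize(final_P)
print("initial loss:", losses[0], "last recorded loss:", losses[-1])
print("counterfactual adjacency:", counterfactual)
```

Because the update can push an entry up as well as down, the same loop both removes edges of the original path and inserts the edge between nodes 0 and 2, which is the kind of search space that deletion-only methods cannot reach.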

Figures (10)

  • Figure 1: (Left) Original graph $G$, predicted mutagenic. (Right) Counterfactual $G'$, predicted non-mutagenic, with the edge additions and removals (and possibly node additions) that flip the prediction highlighted. Although this example respects valid chemical properties (the valence of each atom is preserved), this is not guaranteed in practice because our method is domain-independent.
  • Figure 2: Illustration of counterfactual explanations in graph classification. a and d: the predicted class changes only after an edge deletion (red, dashed). b and e: the predicted class changes only after an edge addition (green). c and f: feature perturbation of the carbon group illustrates benzene converting from neutral to cationic by electron loss. XPlore enables these types of perturbations, alone or in combination, offering a broader search space than edge-deletion-only methods.
  • Figure 3: Metrics comparison on the TCR dataset for different values of the hyperparameter $\gamma$.
  • Figure 4: Validity and Fidelity results for node explanation.
  • Figure 5: t-SNE projection of Wavelet Characteristic embeddings for TCR, comparing CFs generated by CF-GNNExpl, XPlore, and RSGG for the Tree motif. CF-GNNExpl and RSGG find a close CF but fail to land in the Cycle distribution, while XPlore achieves this correctly.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2