Evaluating Neighbor Explainability for Graph Neural Networks
Oscar Llorente, Rana Fawzy, Jared Keown, Michal Horemuz, Péter Vaderna, Sándor Laki, Roland Kotroczó, Rita Csoma, János Márk Szalai-Gindl
TL;DR
This paper studies neighbor-level explainability for node classification in Graph Neural Networks and introduces four metrics (Loyalty, Inverse Loyalty, Loyalty Probabilities, Inverse Loyalty Probabilities) to evaluate how well explanations identify important neighbors. It reformulates several explainability methods to output per-neighbor importances and compares gradient-based approaches (Saliency Map, Deconvnet, Guided-Backpropagation, SmoothGrad) with graph-specific explainers (GNNExplainer, PGExplainer) across GCN and GAT models on Cora, CiteSeer, and PubMed. The results show that gradient-based explanations perform almost identically to one another, unlike typical findings in computer vision, while GNNExplainer is best at identifying highly influential neighbors. PGExplainer generally underperforms, and self-loops critically affect performance: gradient-based methods falter when self-loops are absent. These findings guide practitioners in selecting explanation methods for neighbor-level interpretation in GNNs and suggest two avenues for future work: addressing the limitations that appear when self-loops are missing, and understanding why gradient-based methods converge toward similar results.
Abstract
Explainability in Graph Neural Networks (GNNs) is a new field that has grown over the last few years. In this publication, we address the problem of determining how important each neighbor is to the GNN when classifying a node, and how to measure performance on this specific task. To do this, several known explainability methods are reformulated to obtain neighbor importance, and four new metrics are presented. Our results show that there is almost no difference between the explanations provided by gradient-based techniques in the GNN domain. In addition, many explainability techniques fail to identify important neighbors when GNNs without self-loops are used.
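To make the idea of a per-neighbor importance concrete, the following is a minimal sketch of a saliency-style score, not the paper's formulation. It assumes a single linear GCN layer (logits = Â X W with symmetric normalization), for which the gradient of a node's logit with respect to each neighbor's features has a closed form that does not depend on X. The function names, the toy graph, and the choice to sum absolute gradients over features are all illustrative assumptions.

```python
import numpy as np


def normalized_adjacency(adj, add_self_loops=True):
    """Symmetric normalization D^{-1/2} (A [+ I]) D^{-1/2} (assumed GCN-style)."""
    a = adj.astype(float)
    if add_self_loops:
        a = a + np.eye(a.shape[0])
    deg = a.sum(axis=1)
    # Guard against isolated nodes (degree 0) to avoid division by zero.
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0)), 0.0)
    return a * inv_sqrt[:, None] * inv_sqrt[None, :]


def neighbor_saliency(adj, weight, node, target_class, add_self_loops=True):
    """Hypothetical saliency score of every node for one node's class logit.

    For a one-layer linear model, logit[node, c] = sum_u a_hat[node, u] * (x[u] @ W[:, c]),
    so d logit / d x[u] = a_hat[node, u] * W[:, c]. The score for u is the sum of
    absolute gradient entries over the feature dimension.
    """
    a_hat = normalized_adjacency(adj, add_self_loops)
    grads = a_hat[node][:, None] * weight[:, target_class][None, :]
    return np.abs(grads).sum(axis=1)


# Toy path graph 0 - 1 - 2 with 2 input features and 2 classes (illustrative values).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
weight = np.array([[1.0, -1.0],
                   [2.0, 0.0]])

with_loops = neighbor_saliency(adj, weight, node=1, target_class=0)
without_loops = neighbor_saliency(adj, weight, node=1, target_class=0, add_self_loops=False)
print("with self-loops:   ", with_loops)
print("without self-loops:", without_loops)
```

In this toy setting, removing self-loops drives the score of the classified node itself to zero, because its own features no longer reach its logit. This only illustrates how the self-loop setting changes what a gradient can attribute; it does not reproduce the paper's experiments or metrics.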
