On Discrepancies between Perturbation Evaluations of Graph Neural Network Attributions
Razieh Rezaei, Alireza Dizaji, Ashkan Khakzar, Anees Kazi, Nassir Navab, Daniel Rueckert
TL;DR
This paper tackles the problem of evaluating graph neural network explanations, where existing attribution methods disagree and it is unclear which one to trust. It introduces a retraining evaluation framework for the graph domain, adapting the ROAR idea into RoMie (retrain on the most important edges) and RoLie (retrain on the least important edges) to test how well explanations identify the edges that drive predictions. The study analyzes four state-of-the-art explainers (GradCAM, GNNExplainer, PGExplainer, SubgraphX) across five datasets and two architectures (GCN, GIN). It finds that attribution quality varies strongly with dataset and network, and that GNNExplainer often performs similarly to a random assignment of edge importance. The authors argue that retraining evaluation should be used as a problem-specific toolset rather than a universal benchmark, and they provide practical guidelines for interpreting RoMie/RoLie results, including the treatment of isolated nodes and considerations for out-of-distribution effects. Overall, the work calls for dataset- and network-aware evaluation of graph explanations, so that attribution quality is not overgeneralized and practitioners know which explanations to trust in a given setting.
Abstract
Neural networks are increasingly finding their way into the realm of graphs and modeling relationships between features. Concurrently, graph neural network explanation approaches are being invented to uncover relationships between the nodes of the graphs. However, the existing attribution methods disagree with one another, and it is unclear which attribution to trust. Therefore, research has introduced evaluation experiments that assess them from different perspectives. In this work, we assess attribution methods from a perspective not previously explored in the graph domain: retraining. The core idea is to retrain the network on the important (or unimportant) relationships identified by the attributions and to evaluate how well networks generalize based on these relationships. We reformulate the retraining framework to sidestep issues lurking in the previous formulation and propose guidelines for correct analysis. We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets. The analysis reveals that attributions perform variably depending on the dataset and the network. Most importantly, we observe that the famous GNNExplainer performs similarly to an arbitrary designation of edge importance. The study concludes that the retraining evaluation cannot be used as a generalized benchmark and recommends it as a toolset to evaluate attributions on a specifically addressed network, dataset, and sparsity.
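The edge-selection step behind the retraining idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_edges`, the `keep_ratio` parameter, and the `(2, E)` edge layout are assumptions made for illustration. Given one attribution score per edge, it keeps either the highest-scoring fraction of edges (the RoMie setting) or the lowest-scoring fraction (the RoLie setting). A network would then be retrained on the reduced graphs.

```python
import numpy as np


def select_edges(edge_index, scores, keep_ratio, mode="most"):
    """Keep a fraction of edges ranked by attribution score.

    Hypothetical helper for illustration only.
    edge_index: array-like of shape (2, E), one column per edge.
    scores:     array-like of shape (E,), one importance score per edge.
    keep_ratio: fraction of edges to keep, between 0 and 1.
    mode:       "most" keeps the highest-scoring edges (RoMie-style),
                "least" keeps the lowest-scoring edges (RoLie-style).
    """
    edge_index = np.asarray(edge_index)
    scores = np.asarray(scores, dtype=float)
    num_edges = scores.shape[0]
    num_keep = int(round(keep_ratio * num_edges))

    # Indices of edges sorted by ascending importance.
    order = np.argsort(scores, kind="stable")
    if mode == "most":
        kept = order[num_edges - num_keep:]
    elif mode == "least":
        kept = order[:num_keep]
    else:
        raise ValueError("mode must be 'most' or 'least'")

    # Restore the original edge order among the kept edges.
    kept = np.sort(kept)
    return edge_index[:, kept]


# Toy graph: a 4-cycle with made-up attribution scores.
edges = [[0, 1, 2, 3],
         [1, 2, 3, 0]]
attr = [0.9, 0.1, 0.5, 0.3]

romie_edges = select_edges(edges, attr, keep_ratio=0.5, mode="most")
rolie_edges = select_edges(edges, attr, keep_ratio=0.5, mode="least")
```

Only edges are filtered in this sketch. The node set is left untouched, so nodes that lose all their edges remain in the graph as isolated nodes, and how those are treated is a separate decision.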
