Explaining Graph Neural Networks for Node Similarity on Graphs
Daniel Daza, Cuong Xuan Chu, Trung-Kien Tran, Daria Stepanova, Michael Cochez, Paul Groth
TL;DR
The paper addresses explainability for node similarity computed by graph neural networks (GNNs), defining similarity as the cosine between node embeddings, $y(i,j) = \frac{\mathbf{z}_i^{\top}\mathbf{z}_j}{\|\mathbf{z}_i\|\|\mathbf{z}_j\|}$, where the embeddings $\mathbf{z}_i$ are learned in an unsupervised manner. It compares two explanation paradigms, mutual-information-based (MI) perturbation explanations and gradient-based explanations, and finds that gradient-based explanations are actionable, consistent, and sparse for similarity tasks. In experiments on six datasets with multiple unsupervised GNNs (GAE, VGAE, DGI, GRACE), the authors show that gradient-based explanations yield stable fidelity and low effect overlap, while MI explanations lack such consistency. The findings provide practical guidance for building explainable similarity systems and indicate that sparse gradient-based explanations can ground edge-level interventions without sacrificing explanatory power.
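To make the similarity score concrete, the following is a minimal sketch of the cosine formula above in plain Python. The function name `cosine_similarity` and the three toy embeddings are assumptions introduced for illustration only; in the paper the embeddings would come from a trained unsupervised GNN.

```python
import math

def cosine_similarity(z_i, z_j):
    """Cosine similarity y(i, j) between two embedding vectors."""
    dot = sum(a * b for a, b in zip(z_i, z_j))
    norm_i = math.sqrt(sum(a * a for a in z_i))
    norm_j = math.sqrt(sum(b * b for b in z_j))
    if norm_i == 0.0 or norm_j == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_i * norm_j)

# Hypothetical 3-dimensional node embeddings (not taken from the paper).
embeddings = {
    "a": [1.0, 2.0, 0.0],
    "b": [2.0, 4.0, 0.0],  # same direction as "a"
    "c": [0.0, 0.0, 3.0],  # orthogonal to "a"
}

sim_ab = cosine_similarity(embeddings["a"], embeddings["b"])
sim_ac = cosine_similarity(embeddings["a"], embeddings["c"])
```

Because the score depends only on the direction of the two vectors, `sim_ab` is 1 up to floating-point rounding even though the vectors differ in length, while `sim_ac` is 0 for the orthogonal pair.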
Abstract
Similarity search is a fundamental task for exploiting information in various applications dealing with graph data, such as citation networks or knowledge graphs. While this task has been approached intensively with methods ranging from heuristics to graph embeddings and graph neural networks (GNNs), providing explanations for similarity has received less attention. In this work we are concerned with explainable similarity search over graphs, investigating how GNN-based methods for computing node similarities can be augmented with explanations. Specifically, we evaluate the performance of two prominent approaches to explanation in GNNs: explanations based on mutual information (MI) and gradient-based explanations (GB). We discuss their suitability and empirically validate the properties of their explanations over several popular graph benchmarks. We find that, unlike MI explanations, gradient-based explanations have three desirable properties. First, they are actionable: selecting inputs depending on them results in predictable changes in similarity scores. Second, they are consistent: the effect of selecting certain inputs overlaps very little with the effect of discarding them. Third, they can be pruned significantly to obtain sparse explanations that retain the effect on similarity scores.
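The "actionable" property can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' setup: the encoder is a stand-in that adds weighted neighbour features to each node (not one of the GNNs evaluated in the paper), the graph and features are hypothetical, and the gradient of the similarity score with respect to each edge weight is approximated by central finite differences instead of automatic differentiation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (assumes neither is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def embed(features, edges, weights):
    """Stand-in encoder: each node adds the weighted features of its neighbours."""
    z = {node: list(x) for node, x in features.items()}
    for (u, v), w in zip(edges, weights):
        for d in range(len(features[u])):
            z[u][d] += w * features[v][d]
            z[v][d] += w * features[u][d]
    return z

def similarity(features, edges, weights, i, j):
    """Cosine similarity between the embeddings of nodes i and j."""
    z = embed(features, edges, weights)
    return cosine(z[i], z[j])

def edge_gradients(features, edges, i, j, eps=1e-5):
    """Approximate d similarity / d edge weight at weight 1 for every edge."""
    base = [1.0] * len(edges)
    grads = []
    for k in range(len(edges)):
        up, down = base[:], base[:]
        up[k] += eps
        down[k] -= eps
        diff = similarity(features, edges, up, i, j) - similarity(features, edges, down, i, j)
        grads.append(diff / (2 * eps))
    return grads

# Hypothetical toy graph: explain the similarity between nodes "a" and "b".
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [-1.0, 0.0]}
edges = [("a", "c"), ("b", "c"), ("b", "d")]

grads = edge_gradients(features, edges, "a", "b")
keep = [1.0 if g > 0 else 0.0 for g in grads]  # keep only positive-gradient edges
before = similarity(features, edges, [1.0] * len(edges), "a", "b")
after = similarity(features, edges, keep, "a", "b")
```

In this toy graph the edge between `b` and `d` receives a negative gradient while the two edges to the shared neighbour `c` receive positive ones. Keeping only the positive-gradient edges raises the similarity between `a` and `b` from about 0.45 to 0.80, which is the kind of predictable change the abstract calls actionable.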
