Faithful Interpretation for Graph Neural Networks
Lijie Hu, Tianhao Huang, Lu Yu, Wanyu Lin, Tianhang Zheng, Di Wang
TL;DR
Attention-based GNNs offer explanations through their attention weights, but these explanations can be unstable under graph perturbations. The authors propose Faithful Graph Attention-based Interpretation (FGAI), a minimax-based framework defined by four fidelity-stability properties: interpretability overlap, interpretability stability, prediction closeness, and prediction stability, all quantified with graph-specific metrics. A differentiable surrogate for top-$k$ overlap and a robust objective with perturbation-aware terms yield an efficient algorithm for computing a faithful attention $\tilde{w}$ that remains aligned with the vanilla attention. Empirical results across multiple datasets show that FGAI improves interpretation stability and faithfulness under perturbations while preserving predictive performance, achieving better g-JSD, g-TVD, and F-slope profiles at manageable computational cost.
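To make the notion of top-$k$ overlap concrete, the following is a minimal sketch of the plain (non-differentiable) overlap between two attention vectors. The function names, the tie-breaking rule, and the example weights are hypothetical and chosen only for illustration; this is not the authors' differentiable surrogate, only the hard quantity such a surrogate would approximate.

```python
def top_k_indices(weights, k):
    """Indices of the k largest weights (ties broken by lower index; an assumption)."""
    order = sorted(range(len(weights)), key=lambda i: (-weights[i], i))
    return set(order[:k])


def top_k_overlap(w, w_tilde, k):
    """Fraction of top-k positions shared by two attention vectors, in [0, 1]."""
    if len(w) != len(w_tilde):
        raise ValueError("attention vectors must have the same length")
    if not 1 <= k <= len(w):
        raise ValueError("k must satisfy 1 <= k <= len(w)")
    shared = top_k_indices(w, k) & top_k_indices(w_tilde, k)
    return len(shared) / k


# Hypothetical attention weights over the four neighbours of one node.
vanilla = [0.50, 0.30, 0.10, 0.10]
candidate = [0.45, 0.05, 0.35, 0.15]

# Top-2 sets are {0, 1} and {0, 2}, so one of two positions is shared.
overlap = top_k_overlap(vanilla, candidate, k=2)
```

Because this quantity is built from a set intersection, it is piecewise constant in the weights and gives no useful gradient, which is one plausible reason a differentiable surrogate is needed for training.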
Abstract
Attention mechanisms have drawn increasing interest in Graph Neural Networks (GNNs), for example in Graph Attention Networks (GATs) and Graph Transformers (GTs). This is due not only to the performance gains they offer but also to their capacity to provide a clearer rationale for model behaviors, which are often viewed as inscrutable. However, attention-based GNNs have been shown to yield unstable interpretations when subjected to various sources of perturbation during both training and testing, such as added edges or nodes. In this paper, we address this problem by introducing a novel notion called Faithful Graph Attention-based Interpretation (FGAI). In particular, FGAI has four crucial properties concerning the stability and sensitivity of both the interpretation and the final output distribution. Building on this notion, we propose an efficient methodology for obtaining FGAI, which can be viewed as an ad hoc modification to canonical attention-based GNNs. To validate our proposed solution, we introduce two novel metrics tailored to assessing graph interpretations. Experimental results demonstrate that FGAI exhibits superior stability and preserves the interpretability of attention under various forms of perturbation and randomness, which makes FGAI a more faithful and reliable explanation tool.
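The abstract does not define the two metrics, and their graph-specific forms are not given in this section. As a hedged illustration only, the sketch below computes the standard total variation distance (TVD) and Jensen-Shannon divergence (JSD) between two discrete distributions, which the metric names g-TVD and g-JSD suggest they build on. The function names and example distributions are hypothetical, and how the paper aggregates such quantities over a graph is not assumed here.

```python
import math


def tvd(p, q):
    """Total variation distance between two discrete distributions, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def kl(p, q):
    """KL divergence KL(p || q) in nats; terms with p_i == 0 contribute zero."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jsd(p, q):
    """Jensen-Shannon divergence in nats, bounded above by log(2)."""
    # The mixture m is positive wherever p or q is, so kl(., m) is well defined.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical class probabilities before and after a graph perturbation.
clean_pred = [0.7, 0.2, 0.1]
perturbed_pred = [0.6, 0.3, 0.1]

prediction_shift = tvd(clean_pred, perturbed_pred)  # small value: outputs stayed close
attention_shift = jsd(clean_pred, perturbed_pred)   # the same call applies to attention vectors
```

Under this reading, smaller values of either quantity indicate that the compared distributions changed less under perturbation.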
