MEGAN: Multi-Explanation Graph Attention Network

Jonas Teufel; Luca Torresi; Patrick Reiser; Pascal Friederich

TL;DR

This work tackles the challenge of explaining graph predictions, especially for graph regression, where explanations must capture opposing motif effects. It introduces MEGAN, a multi-channel graph attention network that produces K explanation channels for nodes and edges and jointly trains explanations with the main task through explanation co-training. Across synthetic and real-world datasets, MEGAN achieves high explanation fidelity and, when trained in an explanation-supervised manner, near-perfect alignment with ground-truth explanations, while maintaining strong predictive performance. The approach enhances interpretability by separating evidence polarity into multiple channels and provides a practical framework for assessing and validating explanations via Fidelity* metrics, with the potential to reveal new structure-property relationships in graphs.

Abstract

We propose a multi-explanation graph attention network (MEGAN). Unlike existing graph explainability methods, our network can produce node and edge attributional explanations along multiple channels, the number of which is independent of task specifications. This proves crucial to improve the interpretability of graph regression predictions, as explanations can be split into positive and negative evidence w.r.t. a reference value. Additionally, our attention-based network is fully differentiable, and explanations can actively be trained in an explanation-supervised manner. We first validate our model on a synthetic graph regression dataset with known ground-truth explanations. Our network outperforms existing baseline explainability methods for the single- as well as the multi-explanation case, achieving near-perfect explanation accuracy during explanation supervision. Finally, we demonstrate our model's capabilities on multiple real-world datasets. We find that our model produces sparse high-fidelity explanations consistent with human intuition about those tasks.
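
To make the idea of multi-channel attributions concrete, the following minimal sketch shows how a per-node importance table with K = 2 channels could be read as two separate explanations. It is an illustration only: the function name `split_by_channel`, the threshold of 0.5, the example values, and the convention that channel 0 holds negative and channel 1 positive evidence are all assumptions made here, not details taken from the paper.

```python
def split_by_channel(node_importance, threshold=0.5):
    """Return, for each channel, the indices of nodes whose importance
    exceeds the threshold (a hypothetical way to get sparse explanations)."""
    num_channels = len(node_importance[0])
    return [
        [i for i, row in enumerate(node_importance) if row[k] > threshold]
        for k in range(num_channels)
    ]


# Hypothetical importances for a 4-node graph: one row per node,
# one column per explanation channel (K = 2).
importance = [
    [0.9, 0.1],
    [0.8, 0.0],
    [0.1, 0.7],
    [0.0, 0.2],
]

# Assumed convention: channel 0 = negative evidence, channel 1 = positive.
negative_nodes, positive_nodes = split_by_channel(importance)
```

Keeping the channels as separate columns means each node can contribute to several explanations at once, which a single signed attribution value could not express.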

Paper Structure (27 sections, 13 equations, 6 figures, 3 tables)

Introduction
Related Work
GNN Explanation Methods
Self-Explaining Graph Neural Networks
Explanation Supervision
Multi-Explanation Graph Attention Network
Task Description
Architecture Overview
Explanation Co-Training
Regression
Classification
Multi-Channel Fidelity
Computational Experiments
Synthetic Graph Regression
Single Explanations
...and 12 more sections

Figures (6)

Figure 1: Multi-explanation graph attention network (MEGAN) architecture overview. Rectangular boxes represent layers; arrows indicate layer interconnections. Rounded boxes represent tensors. Intermediate tensors are also shown as named, annotated arrows. Tuples beneath variable names indicate the tensor shape, with the batch dimension omitted but implicitly assumed as the first dimension for all.
Figure 2: Illustration of the split training procedure for the regression case. The explanation-only train step attempts to find an approximate solution to the main prediction task by using only a globally pooled node importance tensor. After the weight update for the explanation step has been applied to the model, the prediction step performs another weight update based on the actual output of the model and the ground truth labels.
Figure 3: Examples of explanations generated for one element of the RbMotifs dataset using selected methods. Explanations are represented as bold highlights of the corresponding graph elements. Left: The ground truth explanations split by the polarity of their influence on the graph target value. Middle: Explanations generated by selected single-explanation methods. Right: Explanations generated by the multi-explanation MEGAN models.
Figure 4: Example explanations generated by MEGAN and GNNExplainer for the prediction of water solubility. Explanations are represented as bold highlights of the corresponding graph elements. (a) Examples of molecules dominated by large carbon structures, which are known as negative influences on water solubility. (b) Examples of molecules containing oxygen functional groups, which are known to be a positive influence on water solubility. (c) Examples of molecules containing nitrogen groups, which are also known as positive influences.
Figure 5: Example explanations obtained from the MEGAN model for the prediction of the singlet-triplet energy gap of the TADF dataset. (a) Explanations that reproduce known chemical intuition about the task. (b) Explanations that reproduce hypotheses previously published by Friederich et al. (2021). (c) New explanatory sub-graph motifs proposed through an observation of the explanations generated by MEGAN.
...and 1 more figures
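
The split training procedure described in the Figure 2 caption can be sketched with a toy model. This is a hedged illustration under strong assumptions and not the authors' implementation: nodes carry a single scalar feature, importances are a sigmoid of a per-channel weight times that feature, the explanation-only output is taken to be the reference value plus pooled positive importance minus pooled negative importance, and gradients are estimated by finite differences. All function and parameter names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pooled_importance(params, nodes):
    # params[0], params[1]: importance weights of the (assumed) negative
    # and positive channel. Importances are summed over all nodes.
    return [sum(sigmoid(params[k] * x) for x in nodes) for k in (0, 1)]


def explanation_output(params, nodes, y_ref):
    # Approximate prediction that uses ONLY the globally pooled importances.
    neg, pos = pooled_importance(params, nodes)
    return y_ref + pos - neg


def model_output(params, nodes):
    # Toy "actual" readout: params[2], params[3] are weights, params[4] a bias.
    neg, pos = pooled_importance(params, nodes)
    return params[2] * neg + params[3] * pos + params[4]


def numeric_grad(loss, params, eps=1e-6):
    # Central finite differences; slow, but keeps the sketch dependency-free.
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grad.append((loss(up) - loss(down)) / (2 * eps))
    return grad


def split_train_step(params, nodes, y_true, y_ref, lr=0.01):
    # Step 1: explanation-only update from the pooled-importance approximation.
    def expl_loss(p):
        return (explanation_output(p, nodes, y_ref) - y_true) ** 2

    grad = numeric_grad(expl_loss, params)
    params = [p - lr * g for p, g in zip(params, grad)]

    # Step 2: prediction update on the already-updated weights.
    def pred_loss(p):
        return (model_output(p, nodes) - y_true) ** 2

    grad = numeric_grad(pred_loss, params)
    return [p - lr * g for p, g in zip(params, grad)]


start = [0.5, -0.5, 1.0, 1.0, 0.0]
graph_nodes = [1.0, -2.0, 0.5]
updated = split_train_step(start, graph_nodes, y_true=1.5, y_ref=0.0)
```

In this sketch the explanation step only moves the importance weights, since the readout parameters do not enter the explanation-only output; the prediction step then updates all parameters.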
