Explainability Techniques for Graph Convolutional Networks

Federico Baldassarre, Hossein Azizpour

TL;DR

This work investigates explainability for Graph Networks by contrasting gradient-based (Sensitivity Analysis, Guided Backpropagation) and decomposition-based (Layer-wise Relevance Propagation) methods on a toy infection task and a solubility regression task. It provides a PyTorch autograd-based implementation to enable GN-level explanations and analyzes how network components like connections, pooling, and heterogeneous features affect interpretability. The results indicate LRP yields more intuitive explanations in graph contexts and highlight practical considerations for evaluating explanations via perturbations. Overall, the paper establishes foundational GN-specific explainability approaches and points to future directions for applying these methods to real-world graph problems.

Abstract

Graph Networks are used to make decisions in potentially complex scenarios but it is usually not obvious how or why they made them. In this work, we study the explainability of Graph Network decisions using two main classes of techniques, gradient-based and decomposition-based, on a toy dataset and a chemistry task. Our study sets the ground for future development as well as application to real-world problems.

Paper Structure

This paper contains 28 sections, 3 equations, and 19 figures.

Figures (19)

  • Figure 1: Explanation for the solubility of sucrose: the prediction is decomposed into positive (red) and negative (blue) contributions attributed to the atoms using Layer-wise Relevance Propagation.
  • Figure 2: Explaining why node 2 becomes infected. SA places high relevance on the node itself (if 2 was more sick at the beginning, it would be more infected at the end). GBP correctly identifies node 1 as a source of infection, but very small importance is given to the edge. LRP decomposes the prediction into a negative contribution (blue, node 2 is not sick), and two positive contributions (red, node 1 is sick and $1\to 2$ is not virtual). Node 4 is ignored due to max pooling.
  • Figure 3: Predicted solubility (log mol/L) and its explanation produced with LRP. Positive relevance (red) falls on R-OH groups, indicating their positive contribution to the predicted value. Negative relevance (blue) can be found on central carbons and non-polar aromatic rings, indicating they advocate towards lower solubility values. See the appendix for a breakdown of the explanation onto the individual features.
  • Figure 4: Node A is responsible for a prediction on B. Even if the input graph does not have features on the edges (represented as dashed lines), the relevant path $A \to B$ is identified by aggregating the relevance at multiple steps of graph convolution.
  • Figure 5: Propagation rules for max pooling. The two top nodes are important, the $N$ bottom ones are not. Relevance can be naïvely propagated from the central node to only one of the top nodes. Approximating max pooling as an $L_p$-norm would result in a more complete explanation, but for large $N$ relevance could disperse to unimportant nodes. A search-based method identifies and propagates relevance only to the relevant nodes.
  • ...and 14 more figures
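
The trade-off described in the Figure 5 caption can be illustrated with a small plain-Python sketch. It is not the paper's implementation: the function names, the input values (1.0 for important nodes, 0.1 for unimportant ones), and the specific sharing rule (relevance split in proportion to |x_i|^p) are assumptions chosen only to show how a winner-take-all rule misses one of the two important nodes and how an L_p-style rule can disperse relevance as N grows. The search-based method is not sketched because its details are not given here.

```python
def propagate_winner_take_all(values, relevance):
    """Send all relevance to the first input that attains the maximum."""
    winner = max(range(len(values)), key=lambda i: values[i])
    return [relevance if i == winner else 0.0 for i in range(len(values))]


def propagate_lp(values, relevance, p):
    """Share relevance in proportion to |x_i|**p (an assumed L_p-style rule)."""
    weights = [abs(v) ** p for v in values]
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(values)
    return [relevance * w / total for w in weights]


def top_share(shares, n_top=2):
    """Fraction of the total relevance that lands on the first n_top inputs."""
    return sum(shares[:n_top]) / sum(shares)


# Hypothetical inputs: two important nodes (1.0) followed by N unimportant ones (0.1).
small = [1.0, 1.0] + [0.1] * 10
large = [1.0, 1.0] + [0.1] * 10_000

naive = propagate_winner_take_all(small, relevance=1.0)
lp_small = propagate_lp(small, relevance=1.0, p=2)
lp_large = propagate_lp(large, relevance=1.0, p=2)

print("naive, first three inputs:", naive[:3])
print("L_p share on top nodes, N=10:", round(top_share(lp_small), 3))
print("L_p share on top nodes, N=10000:", round(top_share(lp_large), 3))
```

With these assumed values, the naive rule gives the second important node no relevance at all, while the L_p-style rule keeps most relevance on the two important nodes for small N and loses most of it to the unimportant ones for large N. Raising p in this sketch concentrates relevance back on the largest inputs.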