A Differential Geometric View and Explainability of GNN on Evolving Graphs

Yazheng Liu, Xi Zhang, Sihong Xie

TL;DR

A smooth parameterization of the GNN-predicted distributions via axiomatic attribution is proposed, where the distributions lie on a low-dimensional manifold within a high-dimensional embedding space.

Abstract

Graphs are ubiquitous in social networks and biochemistry, where Graph Neural Networks (GNN) are the state-of-the-art models for prediction. Graphs can be evolving and it is vital to formally model and understand how a trained GNN responds to graph evolution. We propose a smooth parameterization of the GNN predicted distributions using axiomatic attribution, where the distributions are on a low-dimensional manifold within a high-dimensional embedding space. We exploit the differential geometric viewpoint to model distributional evolution as smooth curves on the manifold. We reparameterize families of curves on the manifold and design a convex optimization problem to find a unique curve that concisely approximates the distributional evolution for human interpretation. Extensive experiments on node classification, link prediction, and graph classification tasks with evolving graphs demonstrate the better sparsity, faithfulness, and intuitiveness of the proposed method over the state-of-the-art methods.
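To make the idea of a curve of predicted distributions concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that the logit change between two graph snapshots decomposes exactly into per-path contributions (the completeness property usually expected of axiomatic attribution), and that scaling every contribution by a common factor $s \in [0,1]$ traces one possible curve $\gamma(s)$ from $\mathsf{Pr}(Y|G_0)$ to $\mathsf{Pr}(Y|G_1)$. The names `curve_point`, `z0`, and `C`, and all numeric values, are hypothetical.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def kl(p, q):
    """KL divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def curve_point(z0, contributions, weights):
    """Distribution obtained when path p's contribution is scaled by weights[p].

    Assumption: logits(G1) = logits(G0) + sum over paths of contributions[p].
    """
    z = list(z0)
    for w, c in zip(weights, contributions):
        for j, c_j in enumerate(c):
            z[j] += w * c_j
    return softmax(z)

# Hypothetical logits of one node on G0 (three classes).
z0 = [2.0, 0.5, -1.0]
# Hypothetical contributions C[p][j] of three paths to each class logit.
C = [[-1.5, 1.0, 0.2],
     [-0.4, 0.9, 0.0],
     [0.05, -0.02, 0.01]]

z1 = [z0[j] + sum(c[j] for c in C) for j in range(len(z0))]
p0, p1 = softmax(z0), softmax(z1)

# Sample the curve gamma(s) at a few values of s.
steps = [0.0, 0.25, 0.5, 0.75, 1.0]
gamma = [curve_point(z0, C, [s] * len(C)) for s in steps]
kls = [kl(p1, g) for g in gamma]

for s, g, d in zip(steps, gamma, kls):
    print(f"s={s:.2f}  Pr={[round(v, 3) for v in g]}  KL(Pr(Y|G1) || gamma(s))={d:.4f}")
```

Giving each path its own weight instead of a shared $s$ yields a whole family of curves between the same two endpoints; choosing among them so that few paths carry nonzero weight is the role the abstract assigns to the convex optimization problem.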

Paper Structure

This paper contains 25 sections, 1 theorem, 22 equations, 17 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

GNN-LRP is a special case of the proposed method when the reference activation is set to that of the empty graph.

Figures (17)

  • Figure 1: $G_0$ at time $s=0$ is updated to $G_1$ at time $s=1$ after the edge $(J,K)$ is added, and the predicted class distribution $\mathsf{Pr}(Y|G_0)$ of node $J$ changes accordingly. The contributions of each path $p$ on a computation graph to $\mathsf{Pr}(Y=j|G)$ for class $j$ give the coordinates of $\mathsf{Pr}(Y|G)$ in a high-dimensional Euclidean space, with axes indexed by $(p,j)$. $\mathsf{Pr}(Y|G)$ varies smoothly on a low-dimensional manifold, where multiple curves $\gamma(s)$ can explain the evolution from $\mathsf{Pr}(Y|G_0)$ to $\mathsf{Pr}(Y|G_1)$ at a very fine granularity. We select a $\gamma(s)$ that uses a sparse set of axes to explain the prediction evolution. Edge deletion, a mixture of addition and deletion, link prediction, and graph classification are handled similarly.
  • Figure 2: Performance in KL$^{+}$ as $G_0\to G_1$ on the node classification tasks. Each column is a dataset and each row is one type of evolution.
  • Figure 3: Average ${KL}^{+}$ on the link prediction and graph classification tasks. Each row is a dataset and each column is one evolution setting.
  • Figure 4: Circles in rectangles are neurons, and a neuron has a specific color if it contributes to the prediction change in a class. Left: DeepLIFT finds the contribution of an input neuron to the change in an output neuron of an MLP for link prediction, where the input layer is the output of a GNN. Right: A two-layer GNN. The four colored quadrants in $\Delta z_j$ at the top layer, which can be the input layer to the MLP, can be attributed to the changes in the input neurons at the input layer (e.g., the two blue quadrants at $J$ at the top are attributed to the blue neurons in node $K$ at the input layer through paths $(K, K, J)$ and $(K, J, J)$).
  • Figure 5: Top left: $G_0$ (e.g., a citation network) at time $t=0$ is updated to $G_1$ at time $t=1$ after the edge $(J,K)$ is added and the edge $(I,J)$ is removed, and the logits $\mathbf{z}_J(G_0)$ and predicted class distribution $\mathsf{Pr}_J(G_0)$ of node $J$ change accordingly. Prior counterfactual methods attribute the change to the edges $(J, K)$ and $(I,J)$. Center left: the GNN computational graph that propagates information from the leaves to the root $J$. Top right: Any path in the computational graph that contains a dashed edge contributes to the prediction change, and we axiomatically attribute the logit changes to these paths with contributions $C_{p,j}$ (for the $p$-th path to the component $\Delta z_j$). Center right: Not all paths are significant contributors, and we formulate a convex program to uniquely identify a few paths that maximally approximate the changes. Bottom: We show the calculation process of $\textnormal{KL}^{+}$ and $\textnormal{KL}^{-}$ after obtaining $E_n$. Other situations, including edge deletion, a mixture of addition and deletion, and link prediction, can be reduced to this simple case.
  • ...and 12 more figures
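
The captions of Figures 4 and 5 describe attributing the prediction change at node $J$ to paths of the computational graph that traverse a changed edge. The sketch below enumerates such paths for a two-layer message-passing GNN after a single edge addition. It is an illustration only, under stated assumptions: the graph is undirected, every node has a self-loop, and the toy edge list (nodes `I`, `J`, `K`, `L`) is hypothetical rather than taken from the paper. The helper names `neighbors`, `paths_to_root`, and `uses_edge` are likewise made up for this example.

```python
def neighbors(edges, node):
    """Neighbors of `node` in an undirected edge list, including a self-loop."""
    nbrs = {node}
    for a, b in edges:
        if a == node:
            nbrs.add(b)
        if b == node:
            nbrs.add(a)
    return nbrs

def paths_to_root(edges, root, num_layers=2):
    """All leaf-to-root paths of the GNN computational graph rooted at `root`.

    A path is a tuple (leaf, ..., root) with num_layers + 1 nodes.
    """
    paths = [(root,)]
    for _ in range(num_layers):
        paths = [(u,) + p for p in paths for u in sorted(neighbors(edges, p[0]))]
    return paths

def uses_edge(path, edge):
    """True if two consecutive nodes on the path are the endpoints of `edge`."""
    a, b = edge
    return any({x, y} == {a, b} for x, y in zip(path, path[1:]))

# Hypothetical toy graph: G0 has edges (I, J) and (K, L); G1 adds (J, K).
g0_edges = [("I", "J"), ("K", "L")]
added_edge = ("J", "K")
g1_edges = g0_edges + [added_edge]

all_paths = paths_to_root(g1_edges, "J")
changed_paths = [p for p in all_paths if uses_edge(p, added_edge)]

print("paths to J on G1:", all_paths)
print("paths through the added edge:", changed_paths)
```

In this toy graph, the paths `('K', 'K', 'J')` and `('K', 'J', 'J')` are the two that start at leaf $K$, matching the example paths named in the Figure 4 caption. Only paths in `changed_paths` can carry a nonzero contribution $C_{p,j}$ when the node features stay fixed, so they are the candidates from which a sparse explanation would be selected.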

Theorems & Definitions (2)

  • Theorem 1
  • proof