How Interpretable Are Interpretable Graph Neural Networks?

Yongqiang Chen; Yatao Bian; Bo Han; James Cheng

TL;DR

This work designs a new XGNN architecture, Graph Multilinear neT (GMT), that is provably more powerful in approximating the subgraph multilinear extension (SubMT), and shows that GMT outperforms the state of the art by up to 10% in both interpretability and generalizability across 12 regular and geometric graph benchmarks.

Abstract

Interpretable graph neural networks (XGNNs) are widely adopted in scientific applications involving graph-structured data. Existing XGNNs predominantly use an attention-based mechanism to learn edge or node importance, which is then used to extract an interpretable subgraph and make predictions from it. However, the representational properties and limitations of these methods remain inadequately explored. In this work, we present a theoretical framework that formulates interpretable subgraph learning with the multilinear extension of the subgraph distribution, coined the subgraph multilinear extension (SubMT). Extracting the desired interpretable subgraph requires an accurate approximation of SubMT, yet we find that existing XGNNs can have a huge gap in fitting SubMT. This approximation failure in turn degrades the interpretability of the extracted subgraphs. To mitigate the issue, we design a new XGNN architecture called Graph Multilinear neT (GMT), which is provably more powerful in approximating SubMT. We empirically validate our theoretical findings on a number of graph classification benchmarks. The results demonstrate that GMT outperforms the state of the art by up to 10% in terms of both interpretability and generalizability across 12 regular and geometric graph benchmarks.
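
For readers who want the object behind the name, the block below is a sketch of what a multilinear extension of a subgraph classifier looks like, following the standard definition of a multilinear extension of a set function. The notation is assumed for illustration rather than quoted from the paper: $G$ has $m$ edges, $\alpha \in [0,1]^m$ holds per-edge sampling probabilities, edges are sampled independently, and $f_c$ is the subgraph classifier.

```latex
% Assumed notation (illustrative): independent Bernoulli edge sampling
% with probabilities \alpha_e, subgraph classifier f_c.
\[
P(G_c \mid G) \;=\; \prod_{e \in G_c} \alpha_e \prod_{e \in G \setminus G_c} (1 - \alpha_e)
\]
\[
F_c(\alpha; G) \;=\; \sum_{G_c \subseteq G} f_c(G_c)\, P(G_c \mid G)
\;=\; \mathbb{E}_{G_c \sim P(G_c \mid G)}\big[ f_c(G_c) \big]
\]
% Feeding the expected "soft" subgraph \widehat{G}_c = \mathbb{E}[G_c] to f_c
% is a different quantity and in general does not equal F_c(\alpha; G):
\[
f_c\big(\widehat{G}_c\big) \;\neq\; \mathbb{E}_{G_c \sim P(G_c \mid G)}\big[ f_c(G_c) \big]
\quad \text{in general.}
\]
```

Under this reading, "approximating SubMT" means making the classifier's output on the attention-weighted graph close to the expectation on the right-hand side.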

Paper Structure (55 sections, 6 theorems, 58 equations, 21 figures, 12 tables, 3 algorithms)

Introduction
Preliminaries and Related Work
On the Expressivity of Interpretable GNNs
Subgraph multilinear extension
Issues of existing approaches
On the Generalization and Interpretability: A Causal View
Causal model of interpretable GNNs
Causal faithfulness of XGNNs
Building Reliable XGNNs
Linearized GMT
GMT with random subgraph sampling
Learning neural subgraph multilinear extension
Experimental Evaluations
Experimental settings
Experimental results and analysis
...and 40 more sections

Key Result

Proposition 3.3

An XGNN based on a linear GNN with $k>1$ cannot satisfy Eq. (eq:exp_issue) and thus cannot approximate SubMT.
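
The intuition behind this kind of result can be checked numerically on a toy case. The sketch below is not the paper's construction: it assumes a hypothetical linear classifier that sum-pools $A^k X w$, reads $k$ as the exponent of the adjacency matrix, and uses independent Bernoulli edges on a 3-node graph. It compares the exact expectation over all sampled subgraphs with the output on the expected "soft" adjacency matrix.

```python
import itertools
import numpy as np

def linear_gnn(A, X, w, k):
    """Toy linear GNN (assumption): sum-pool the node outputs of A^k X w."""
    return float(np.sum(np.linalg.matrix_power(A, k) @ X @ w))

def adjacency(n, edges, weights):
    """Symmetric weighted adjacency matrix for an undirected edge list."""
    A = np.zeros((n, n))
    for (i, j), a in zip(edges, weights):
        A[i, j] = A[j, i] = a
    return A

def expected_output(n, edges, alpha, X, w, k):
    """Exact expectation of the classifier over all 2^m edge subsets."""
    total = 0.0
    for mask in itertools.product([0, 1], repeat=len(edges)):
        # Probability of this subgraph under independent Bernoulli edges.
        prob = np.prod([a if m else 1.0 - a for m, a in zip(mask, alpha)])
        total += prob * linear_gnn(adjacency(n, edges, mask), X, w, k)
    return total

def soft_output(n, edges, alpha, X, w, k):
    """Classifier applied once to the expected (soft) adjacency matrix."""
    return linear_gnn(adjacency(n, edges, alpha), X, w, k)

# Hypothetical toy inputs: a triangle with edge probability 0.5 everywhere.
n = 3
edges = [(0, 1), (1, 2), (0, 2)]
alpha = [0.5, 0.5, 0.5]
X = np.ones((n, 1))
w = np.ones(1)

gap_k1 = abs(expected_output(n, edges, alpha, X, w, 1) - soft_output(n, edges, alpha, X, w, 1))
gap_k2 = abs(expected_output(n, edges, alpha, X, w, 2) - soft_output(n, edges, alpha, X, w, 2))
print(f"k=1 gap: {gap_k1:.6f}, k=2 gap: {gap_k2:.6f}")
```

With $k=1$ the toy classifier is linear in the edge weights, so the two quantities coincide. With $k=2$ the output contains squared edge terms, and since a Bernoulli variable satisfies $\mathbb{E}[a^2] = \alpha \neq \alpha^2$ for $0 < \alpha < 1$, the gap is nonzero.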

Figures (21)

Figure 1: Illustration of Subgraph Multilinear Extension approximation failure. In a binary graph classification task, XGNNs classify whether a graph contains a specific “house” or “cycle” motif in two steps. (a) Subgraph Extraction: a subgraph extractor computes the sampling probability of each edge using the attention mechanism, which in turn determines the subgraph distribution: part of the “house” or “cycle” motif is sampled with a certain probability. The expected subgraph $\widehat{G}_c$ under the subgraph distribution is a “soft” subgraph in which each edge is weighted by its sampling probability. (b) Subgraph Classification: each subgraph $G_c^i$ corresponds to a label distribution $P(Y|G_c^i)$ (e.g., when some key parts of the “house” motif are sampled, the “house” label probability is higher). The final label prediction is conditioned on the subgraph distribution: it averages each $P(Y|G_c^i)$, weighted by the probability of $G_c^i$ being sampled. In the example, the averaged label distribution leads to a prediction of “house”. Instead of averaging the subgraph-conditional label predictions, previous methods directly take the expected “soft” subgraph as the input to the subgraph classifier GNN $f_c$, which can be biased and lead to an incorrect prediction of “cycle”.
Figure 2: Comparison of simulated SubMT and GSAT in terms of counterfactual faithfulness.
Figure 3: Ablation studies.
Figure 4: Full SCMs on Graph Distribution Shifts (CIGA).
Figure 5: Bernoulli Parameterized SCM for interpretable GNN.
...and 16 more figures
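
The two prediction routes contrasted in the Figure 1 caption can be sketched with a small Monte Carlo experiment. Everything here is an assumption made for illustration: the classifier is a hypothetical stand-in (a sigmoid over a triangle count, with the triangle playing the role of the key “house” part), not a trained GNN, and the edge probability 0.82 and the sigmoid constants are chosen only to make the effect visible.

```python
import numpy as np

def classifier(A):
    """Hypothetical subgraph classifier returning P(Y = "house" | subgraph).

    Scores a (possibly weighted) adjacency matrix by trace(A^3) / 6, which
    counts triangles on a binary graph, then applies a sigmoid.
    """
    triangles = np.trace(np.linalg.matrix_power(A, 3)) / 6.0
    return 1.0 / (1.0 + np.exp(-(10.0 * triangles - 6.0)))

def sample_adjacency(alpha, rng):
    """Sample an undirected subgraph with independent Bernoulli edges."""
    keep = (rng.random(alpha.shape) < alpha).astype(float)
    upper = np.triu(keep, k=1)  # one draw per undirected edge
    return upper + upper.T

rng = np.random.default_rng(0)

# Edge sampling probabilities on a 3-node graph (no self-loops).
alpha = np.full((3, 3), 0.82)
np.fill_diagonal(alpha, 0.0)

# Route 1: feed the expected "soft" subgraph to the classifier once.
soft_pred = float(classifier(alpha))

# Route 2: sample discrete subgraphs and average their predictions.
num_samples = 10_000
sampled_pred = float(
    np.mean([classifier(sample_adjacency(alpha, rng)) for _ in range(num_samples)])
)

print(f"soft-subgraph prediction:    P(house) = {soft_pred:.3f}")
print(f"averaged sampled prediction: P(house) = {sampled_pred:.3f}")
```

In this toy setup the soft-subgraph route falls below 0.5 while the averaged route lands above it, mirroring the “cycle” versus “house” disagreement described in the caption. The size and direction of the gap depend entirely on the assumed classifier.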

Theorems & Definitions (15)

Definition 3.1: Subgraph multilinear extension (SubMT)
Definition 3.2: $\epsilon$-SubMT approximation
Proposition 3.3
Definition 4.1: $(\delta,\epsilon)$-counterfactual fidelity
Proposition 4.2
Theorem 5.1
Definition 4.1: Subgraph multilinear extension (SubMT)
Definition 4.2: $\epsilon$-SubMT approximation
Definition 4.3: $(\delta,\epsilon)$-counterfactual fidelity
Proposition 4.4
...and 5 more
