On GNN explainability with activation rules

Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, Céline Robardet

TL;DR

The paper tackles explainability for graph neural networks by mining activation rules across hidden layers to reveal internal representations used for graph classification. It introduces INSIDE-GNN, which constructs per-layer activation matrices and uses a background model with subjective interestingness to enumerate non-redundant activation rules that cover the data. The method yields instance-level explanations and high-level insights into hidden features, and the authors compare favorably against state-of-the-art explainability methods, reporting up to 200% fidelity gains. Pattern languages for summarizing rules (numerical subgroups, discriminant subgraphs) enable interpretable descriptions of what the GNN learns across layers, supporting knowledge discovery.

Abstract

GNNs are powerful models based on node representation learning that perform particularly well in many machine learning problems related to graphs. The major obstacle to the deployment of GNNs is mostly a problem of societal acceptability and trustworthiness, properties which require making explicit the internal functioning of such models. Here, we propose to mine activation rules in the hidden layers to understand how GNNs perceive the world. The problem is not to discover activation rules that are individually highly discriminating for an output of the model. Instead, the challenge is to provide a small set of rules that cover all input graphs. To this end, we introduce the subjective activation pattern domain. We define an effective and principled algorithm to enumerate activation rules in each hidden layer. The proposed approach for quantifying the interest of these rules is rooted in information theory and is able to account for background knowledge on the input graph data. The activation rules can then be redescribed using pattern languages involving interpretable features. We show that the activation rules provide insights into the characteristics used by the GNN to classify the graphs. In particular, this makes it possible to identify the hidden features built by the GNN through its different layers. These rules can also subsequently be used for explaining GNN decisions. Experiments on both synthetic and real-life datasets show highly competitive performance, with up to 200% improvement in fidelity on explaining graph classification over the SOTA methods.

Paper Structure (5 sections, 3 equations, 2 figures)

Figures (2)

  • Figure 1: Overview of INSIDE-GNN. For each layer, (1) a binary matrix encodes which embedding-vector components each node activates. (2) A background model summarizes what is known about these data: initially, the activation probabilities of the components are independent of the nodes of the graphs. (3) INSIDE-GNN extracts the most informative activation rule with respect to the background knowledge. (4) This rule is integrated into the background model, whose marginal distributions thereby become less and less independent, and (5) it is added to the pattern set. Steps (2)-(5) are repeated until no rule brings significant information about the data in the table. The activation rules are then used (6) to support instance-level explanations or (7) to provide insights into the model.
  • Figure 2: Toy example: The internal GNN representation of 4 graphs on the third layer with $K=6$.
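
The loop described in the Figure 1 caption (steps 1 to 5) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the zero activation threshold, the independent per-cell background probabilities, the restriction of candidates to at most two components, the score (information content of covered cells divided by a crude description length), and the update that sets covered cells to probability 1 are all assumptions made for illustration. All function names are hypothetical.

```python
import math
from itertools import combinations


def binarize(embeddings, threshold=0.0):
    """Step 1: binary activation matrix, one row per node (assumed threshold: 0)."""
    return [[1 if x > threshold else 0 for x in row] for row in embeddings]


def initial_background(matrix):
    """Step 2: every node gets the same activation probability per component
    (the column frequency), i.e. activations are independent of the nodes."""
    n, k = len(matrix), len(matrix[0])
    freq = [sum(row[j] for row in matrix) / n for j in range(k)]
    return [freq[:] for _ in range(n)]


def covered_nodes(matrix, components):
    """Indices of nodes that activate every component of the candidate rule."""
    return [i for i, row in enumerate(matrix) if all(row[j] for j in components)]


def score(matrix, background, components):
    """Simplified interestingness: surprise of the covered cells under the
    background model, divided by an assumed description length."""
    ic = 0.0
    for i in covered_nodes(matrix, components):
        for j in components:
            ic += -math.log2(background[i][j])
    return ic / (len(components) + 1)


def mine_rules(matrix, max_size=2, min_score=1e-9):
    """Steps 3-5: greedily pick the most informative rule, fold it into the
    background model, and stop when no candidate is informative any more."""
    background = initial_background(matrix)
    k = len(matrix[0])
    candidates = [c for size in range(1, max_size + 1)
                  for c in combinations(range(k), size)]
    rules = []
    while True:
        best = max(candidates, key=lambda c: score(matrix, background, c))
        if score(matrix, background, best) <= min_score:
            break
        rules.append(best)
        # Assumed update: covered cells are now fully expected (probability 1).
        for i in covered_nodes(matrix, best):
            for j in best:
                background[i][j] = 1.0
    return rules


if __name__ == "__main__":
    # Invented node embeddings (4 nodes, 3 components), for illustration only.
    toy = [[0.5, 1.2, 0.0], [0.3, 0.7, 0.0], [0.0, 0.0, 2.0], [0.0, 0.4, 0.0]]
    print(mine_rules(binarize(toy)))
```

Because each selected rule lowers the surprise of the cells it covers, later iterations favour rules that explain activations not yet accounted for, which is the non-redundancy idea the caption describes.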

Theorems & Definitions (2)

  • Definition: Activation matrix
  • Definition: Activation rule and support
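
Only the titles of these definitions are listed here, so the following sketch rests on an assumed reading: a component counts as activated for a node when its embedding value is positive, and a graph supports a rule when at least one of its nodes activates every component of the rule. The data are invented (4 graphs with K=6, echoing the setting of Figure 2 but not its values), and the function names are hypothetical.

```python
def activation_matrix(node_embeddings, threshold=0.0):
    """Map each node to a list of booleans, one per embedding component.
    Assumption: a component is activated when its value exceeds `threshold`."""
    return {node: [x > threshold for x in vec]
            for node, vec in node_embeddings.items()}


def rule_support(activations, node_to_graph, components):
    """Assumed support: graphs having at least one node that activates
    all components listed in the rule."""
    support = set()
    for node, active in activations.items():
        if all(active[k] for k in components):
            support.add(node_to_graph[node])
    return support


# Invented embeddings for nodes of 4 graphs, K = 6 components.
node_embeddings = {
    "g1_a": [0.2, 0.0, 0.0, 1.1, 0.0, 0.0],
    "g1_b": [0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
    "g2_a": [0.7, 0.0, 0.0, 0.3, 0.0, 0.9],
    "g3_a": [0.0, 0.0, 0.4, 0.0, 0.0, 0.0],
    "g4_a": [0.1, 0.0, 0.0, 0.0, 0.6, 0.0],
}
node_to_graph = {"g1_a": "g1", "g1_b": "g1", "g2_a": "g2",
                 "g3_a": "g3", "g4_a": "g4"}

activations = activation_matrix(node_embeddings)
support = rule_support(activations, node_to_graph, components=(0, 3))
```

Under this reading, a rule over more components can only keep or shrink its support, since every supporting node must activate all of them.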