Utilizing Description Logics for Global Explanations of Heterogeneous Graph Neural Networks

Dominik Köhler, Stefan Heindorf

TL;DR

We address the lack of global explanations for heterogeneous GNNs by introducing class-expression-based explanations drawn from description logic. The method searches for expressive CE explanations via beam search, scoring them either by fidelity on a validation set or by how well the GNN’s predictions align with graphs generated to satisfy the CE. Experiments on a heterogeneous BA-Shapes dataset show that fidelity-based explanations can reveal consistent, ground-truth motifs and enable detection of spurious correlations, while the approach remains model-agnostic and scalable with GNN depth. This work provides semantically precise, human-understandable explanations that can assist debugging, model validation, and deeper insight into learned GNN behavior on complex, multi-type graphs.

Abstract

Graph Neural Networks (GNNs) are effective for node classification in graph-structured data, but they lack explainability, especially at the global level. Current research mainly utilizes subgraphs of the input as local explanations or generates new graphs as global explanations. However, these graph-based methods are limited in their ability to explain classes with multiple sufficient explanations. To provide more expressive explanations, we propose utilizing class expressions (CEs) from the field of description logic (DL). Our approach explains heterogeneous graphs with different types of nodes using CEs in the EL description logic. To identify the best explanation among multiple candidate explanations, we employ and compare two different scoring functions: (1) For a given CE, we construct multiple graphs, have the GNN make a prediction for each graph, and aggregate the predicted scores. (2) We score the CE in terms of fidelity, i.e., we compare the predictions of the GNN to the predictions by the CE on a separate validation set. Instead of subgraph-based explanations, we offer CE-based explanations.
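The fidelity score described in (2) can be illustrated with a minimal sketch. It assumes that each node has exactly one type, that a CE is a node type plus a list of existential restrictions, and that fidelity is the fraction of validation nodes on which the CE and the GNN agree. The `CE` class, the helper names, the toy graph, and the GNN predictions are all hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CE:
    """Simplified EL class expression: node type plus existential restrictions."""
    node_type: str
    exists: tuple = ()  # tuple of (edge_type, CE) pairs


def satisfies(node, ce, types, edges):
    """Return True if `node` fulfils `ce`.

    types: dict mapping node -> node type
    edges: dict mapping (node, edge_type) -> list of neighbouring nodes
    """
    if types[node] != ce.node_type:
        return False
    # Every restriction needs at least one neighbour that fulfils its filler.
    return all(
        any(satisfies(nb, filler, types, edges)
            for nb in edges.get((node, role), []))
        for role, filler in ce.exists
    )


def fidelity(ce, gnn_pred, types, edges, nodes):
    """Fraction of `nodes` on which the CE and the GNN give the same label."""
    agree = sum(int(satisfies(n, ce, types, edges)) == gnn_pred[n] for n in nodes)
    return agree / len(nodes)


# Toy financial graph (assumed data): a1 reaches C via both a B and a P account.
types = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "p1": "P", "c1": "C"}
edges = {
    ("a1", "t"): ["b1", "p1"],
    ("b1", "t"): ["c1"],
    ("p1", "t"): ["c1"],
    ("a2", "t"): ["b2"],
}

# A ⊓ ∃t.(B ⊓ ∃t.C) ⊓ ∃t.(P ⊓ ∃t.C)
fraud = CE("A", (
    ("t", CE("B", (("t", CE("C")),))),
    ("t", CE("P", (("t", CE("C")),))),
))

# Hypothetical GNN predictions on two validation nodes.
gnn_pred = {"a1": 1, "a2": 1}
score = fidelity(fraud, gnn_pred, types, edges, ["a1", "a2"])
```

Here the CE labels only `a1` as positive while the assumed GNN labels both nodes as positive, so the two agree on half of the validation nodes.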

Paper Structure (41 sections, 5 equations, 7 figures, 3 tables, 4 algorithms)

Figures (7)

  • Figure 1: In a financial network where each node type is a specific kind of account, an asset account A might be labeled as fraudulent if it has transactions t with at least two accounts of type business or personal (nodes of type B or P), both of which transfer money to (possibly distinct) known criminal accounts (nodes of type C). The two left-hand graphs refer to the class expression (CE) $\text{A} \sqcap \exists \text{t.} \left(\text{B} \sqcap \exists \text{t.C} \right) \sqcap \exists\text{t.}\left( \text{P} \sqcap \exists \text{t.C} \right)$ in $\mathcal{EL}$, which we implemented in our approach, whereas the right-hand side refers to the more accurate CE $\text{A} \sqcap \exists_{\ge 2} \text{t.} \left( (\text{B} \sqcup \text{P}) \sqcap \exists \text{t.C} \right)$, which we leave for future work.
  • Figure 2: An overview of our approach: We start with class expressions (CEs), and score them individually by using the GNN on graphs that have a node fulfilling the CE. For the next iteration, we take (1) the best results and (2) mutated versions of the best results.
  • Figure 3: Two possibilities for mutating the CE $\text{A} \sqcap \exists \text{r.B}$.
  • Figure 4: Different graphs for the CE $\text{A} \sqcap\exists \text{r.} (\text{B} \sqcap \exists \text{r}. (\text{B} \sqcap \exists \text{r. A}))$.
  • Figure 5: One house motif (left) from our Hetero-BA-Shapes dataset that is connected to a random Barabási-Albert graph (right). The colors indicate the node type (red indicates type A, blue type B, orange type C, and green type D). The numbers in the nodes indicate the label. The red node on the right does not receive the label 1 because it is not part of a house motif.
  • ...and 2 more figures
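
The search loop outlined in Figures 2 and 3 (score the CEs, keep the best, add mutated versions of the best) can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: a CE is encoded as a nested tuple `(node_type, ((role, filler), ...))`, a mutation adds exactly one existential restriction either at the root or inside an existing filler, and `depth` is a stand-in score that simply prefers deeper expressions so the example runs without a trained GNN. In practice the score would be the fidelity or the aggregated GNN output on generated graphs.

```python
def mutations(ce, node_types, roles):
    """Yield every CE obtained by adding one existential restriction to `ce`."""
    node_type, exists = ce
    # Possibility 1: attach a new restriction at the root of this expression.
    for role in roles:
        for t in node_types:
            yield (node_type, exists + ((role, (t, ())),))
    # Possibility 2: extend one of the existing fillers recursively.
    for i, (role, filler) in enumerate(exists):
        for new_filler in mutations(filler, node_types, roles):
            yield (node_type, exists[:i] + ((role, new_filler),) + exists[i + 1:])


def beam_search(start, score, node_types, roles, beam_width=3, iterations=2):
    """Keep the `beam_width` best CEs among the beam and its mutations."""
    beam = list(start)
    for _ in range(iterations):
        pool = list(beam)
        for ce in beam:
            pool.extend(mutations(ce, node_types, roles))
        # dict.fromkeys removes duplicates while keeping a deterministic order.
        candidates = list(dict.fromkeys(pool))
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam


def depth(ce):
    """Stand-in score (assumption): nesting depth of the class expression."""
    _, exists = ce
    return 1 + max(depth(f) for _, f in exists) if exists else 0


def to_str(ce):
    """Render a CE in description-logic notation."""
    node_type, exists = ce
    parts = [node_type]
    for role, filler in exists:
        inner = to_str(filler)
        parts.append(f"∃{role}.({inner})" if filler[1] else f"∃{role}.{inner}")
    return " ⊓ ".join(parts)


# Mutating A ⊓ ∃r.B over the node types {A, B} and the single role r.
base = ("A", (("r", ("B", ())),))
variants = [to_str(m) for m in mutations(base, ["A", "B"], ["r"])]

# Two iterations of beam search starting from the atomic CE A.
best = beam_search([("A", ())], depth, ["A", "B"], ["r"])
```

Enumerating all single-step mutations keeps the sketch deterministic. A sampling-based variant would instead draw one position and one node type at random per mutation.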

Theorems & Definitions (1)

  • Definition 1