Relational Graph Convolutional Networks Do Not Learn Sound Rules

Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks

TL;DR

This work investigates the explainability of relational GNNs for knowledge-graph completion by linking their predictions to Datalog rules. It introduces a channel-based framework that identifies monotonic substructures (safe, stable, increasing) and detects unbounded channels, for which no sound rules exist; this supports both rigorous rule extraction and the construction of counterexamples. In the experiments, standard R-GCN training leaves every channel unbounded on both monotonic and mixed benchmarks, which undermines the faithfulness of rule-based explanations. Clamping small weights to zero, or otherwise enforcing monotonicity, exposes a trade-off: more sound rules can be learned, but predictive accuracy can suffer. This highlights a tension between interpretability and performance and suggests practical routes to more faithful KG reasoning systems.

Abstract

Graph neural networks (GNNs) are frequently used to predict missing facts in knowledge graphs (KGs). Motivated by the lack of explainability for the outputs of these models, recent work has aimed to explain their predictions using Datalog, a widely used logic-based formalism. However, such work has been restricted to certain subclasses of GNNs. In this paper, we consider one of the most popular GNN architectures for KGs, R-GCN, and we provide two methods to extract rules that explain its predictions and are sound, in the sense that each fact derived by the rules is also predicted by the GNN, for any input dataset. Furthermore, we provide a method that can verify that certain classes of Datalog rules are not sound for the R-GCN. In our experiments, we train R-GCNs on KG completion benchmarks, and we are able to verify that no Datalog rule is sound for these models, even though the models often obtain high to near-perfect accuracy. This raises some concerns about the ability of R-GCN models to generalise and about the explainability of their predictions. We further provide two variations to the training paradigm of R-GCN that encourage it to learn sound rules and find a trade-off between model accuracy and the number of learned sound rules.
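The notion of soundness used in the abstract (every fact derived by a rule is also predicted by the model, for any input dataset) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: facts are tuples, variables are strings starting with `?`, and `toy_model` is a hypothetical stand-in for a trained R-GCN. Testing a finite list of datasets can only refute soundness, never establish it, since soundness quantifies over all datasets.

```python
from itertools import product


def is_var(term):
    # Hypothetical convention: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")


def apply_rule(rule, dataset):
    """Return the facts derived by one application of `rule` to `dataset`."""
    head, body = rule
    constants = sorted({c for fact in dataset for c in fact[1:]})
    variables = sorted({t for atom in (head, *body) for t in atom[1:] if is_var(t)})
    derived = set()
    # Naive grounding: try every assignment of constants to variables.
    for values in product(constants, repeat=len(variables)):
        sub = dict(zip(variables, values))
        ground = lambda atom: (atom[0], *(sub.get(t, t) for t in atom[1:]))
        if all(ground(atom) in dataset for atom in body):
            derived.add(ground(head))
    return derived


def find_counterexample(rule, model, datasets):
    """Return (dataset, missing_facts) if the rule derives a fact the model
    does not predict on some dataset; return None if no dataset refutes it."""
    for dataset in datasets:
        missing = apply_rule(rule, dataset) - model(dataset)
        if missing:
            return dataset, sorted(missing)
    return None


def toy_model(dataset):
    # Hypothetical stand-in for a trained model (NOT an R-GCN): predicts
    # Person(x) only when x has at most two outgoing worksAt facts.
    counts = {}
    for fact in dataset:
        if fact[0] == "worksAt":
            counts[fact[1]] = counts.get(fact[1], 0) + 1
    return {("Person", s) for s, n in counts.items() if n <= 2}


# Rule: worksAt(?x, ?y) -> Person(?x)
rule = (("Person", "?x"), [("worksAt", "?x", "?y")])
small = {("worksAt", "a", "u1")}
large = {("worksAt", "a", "u1"), ("worksAt", "a", "u2"), ("worksAt", "a", "u3")}

print(find_counterexample(rule, toy_model, [small]))
print(find_counterexample(rule, toy_model, [small, large]))
```

On `small` alone the rule looks sound for the toy model, but adding `large` exposes a dataset on which the rule derives a fact the model does not predict. This is why agreement on a test set says little about soundness.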

Paper Structure (26 sections, 14 theorems, 24 equations, 2 figures, 6 tables)

Key Result

Proposition 1

If $\alpha$ is a rule or program sound for sum-GNN $\mathcal{N}$, then for any dataset $D$ and $k \in \mathbb{N}$, the containment holds when $T_\alpha$ and $T_\mathcal{N}$ are composed $k$ times: $T_\alpha^k(D) \subseteq T_{\mathcal{N}}^k(D)$.
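
One way to see why the containment should hold is a short induction on $k$. This is a sketch under stated assumptions, not the paper's proof: it assumes that $T_\alpha$ is the usual immediate-consequence operator of Datalog, which is monotone with respect to $\subseteq$, and that soundness means $T_\alpha(D') \subseteq T_{\mathcal{N}}(D')$ for every dataset $D'$. For $k = 0$ both sides equal $D$. For the inductive step, assume $T_\alpha^{k}(D) \subseteq T_{\mathcal{N}}^{k}(D)$; then:

```latex
\begin{align*}
T_\alpha^{k+1}(D)
  &= T_\alpha\bigl(T_\alpha^{k}(D)\bigr) \\
  &\subseteq T_\alpha\bigl(T_{\mathcal{N}}^{k}(D)\bigr)
     && \text{(induction hypothesis and monotonicity of } T_\alpha\text{)} \\
  &\subseteq T_{\mathcal{N}}\bigl(T_{\mathcal{N}}^{k}(D)\bigr)
     && \text{(soundness of } \alpha \text{ on the dataset } T_{\mathcal{N}}^{k}(D)\text{)} \\
  &= T_{\mathcal{N}}^{k+1}(D).
\end{align*}
```

The argument uses only the monotonicity of $T_\alpha$; nothing is assumed about the monotonicity of $T_{\mathcal{N}}$, because soundness is applied directly to the dataset $T_{\mathcal{N}}^{k}(D)$.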

Figures (2)

  • Figure 1: Canonical encoding $G_d$ of the dataset $D_d$ that extends $D^{\nu}_r$ in the proof of Theorem \ref{thm:neginfline_sound} for a given $d \in \mathbb{N}$.
  • Figure 2: Succinct representation of the encoding of dataset $D_d$ in the family of datasets used in our counterexample.

Theorems & Definitions (27)

  • Proposition 1
  • Definition 2
  • Lemma 3
  • proof
  • Proposition 4
  • proof
  • Definition 5
  • Lemma 6
  • proof : Proof sketch
  • Proposition 7
  • ...and 17 more