
Rule Based Learning with Dynamic (Graph) Neural Networks

Florian Seiffarth

TL;DR

This work addresses the problem of injecting expert knowledge into neural networks by proposing rule-based layers that arrange learnable parameters dynamically, conditioned on the input. The approach has two steps: derive rules from domain knowledge, then convert them into rule functions that select where weights and biases are placed, which yields input-specific architectures. The paper focuses on the second step and formalizes a general rule-based layer $f(x,\Theta,\mathbf{R})=\sigma(W_{\mathbf{R}_W(x)}\cdot x+b_{\mathbf{R}_b(x)})$. It introduces RuleGNNs, a graph-classification architecture built from Weisfeiler-Leman and pattern-counting rules plus an aggregation layer, proves their permutation equivariance, and reports performance comparable to state-of-the-art graph classifiers on real-world benchmarks. Synthetic datasets illustrate the benefit of explicit expert-knowledge rules for long-range dependencies and interpretability. Interpretability and adaptability are presented as the key advantages of knowledge-informed dynamic networks for graphs, and automatic rule learning and rule-based pruning are named as future directions for improving efficiency and transferability.
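The layer formula can be made concrete with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the names `rule_based_layer`, `distance_rule_w`, and `shared_rule_b` are hypothetical, and the distance-based rule is a toy choice made only to show that the same parameter set $\Theta$ can serve inputs of different sizes.

```python
import math

def rule_based_layer(x, theta_w, theta_b, rule_w, rule_b, sigma=math.tanh):
    """Evaluate sigma(W_{R_W(x)} . x + b_{R_b(x)}) for a single input x.

    rule_w(x) returns a dict mapping a position (i, j) to an index into
    theta_w; positions missing from the dict carry no weight (treated as 0).
    rule_b(x) returns one index into theta_b per output (None = no bias).
    """
    weight_pos = rule_w(x)
    bias_pos = rule_b(x)
    out = []
    for i in range(len(bias_pos)):
        s = 0.0
        for j, xj in enumerate(x):
            k = weight_pos.get((i, j))
            if k is not None:
                s += theta_w[k] * xj
        if bias_pos[i] is not None:
            s += theta_b[bias_pos[i]]
        out.append(sigma(s))
    return out

def distance_rule_w(x):
    # Hypothetical toy rule: the weight at (i, j) is chosen by |i - j|,
    # capped at 2, so three parameters cover inputs of any length.
    n = len(x)
    return {(i, j): min(abs(i - j), 2) for i in range(n) for j in range(n)}

def shared_rule_b(x):
    # Hypothetical toy rule: every output shares the single bias theta_b[0].
    return [0] * len(x)

theta_w = [1.0, 0.5, 0.0]   # weights for distance 0, 1, and >= 2
theta_b = [0.0]
identity = lambda s: s      # identity activation keeps the numbers readable
print(rule_based_layer([1.0, 2.0, 3.0], theta_w, theta_b,
                       distance_rule_w, shared_rule_b, sigma=identity))
print(rule_based_layer([1.0] * 5, theta_w, theta_b,
                       distance_rule_w, shared_rule_b, sigma=identity))
```

Both calls use the same three weights and one bias even though the inputs have different lengths, which is the sense in which the rule, rather than a fixed matrix shape, determines the architecture.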

Abstract

A common problem of classical neural network architectures is that additional information or expert knowledge cannot be naturally integrated into the learning process. To overcome this limitation, we propose a two-step approach consisting of (1) generating rule functions from knowledge and (2) using these rules to define rule based layers -- a new type of dynamic neural network layer. The focus of this work is on the second step, i.e., rule based layers that are designed to dynamically arrange learnable parameters in the weight matrices and bias vectors depending on the input samples. Indeed, we prove that our approach generalizes classical feed-forward layers such as fully connected and convolutional layers by choosing appropriate rules. As a concrete application we present rule based graph neural networks (RuleGNNs) that overcome some limitations of ordinary graph neural networks. Our experiments show that the predictive performance of RuleGNNs is comparable to state-of-the-art graph classifiers using simple rules based on Weisfeiler-Leman labeling and pattern counting. Moreover, we introduce new synthetic benchmark graph datasets to show how to integrate expert knowledge into RuleGNNs making them more powerful than ordinary graph neural networks.
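The claim that suitable rules recover fully connected and convolutional layers can be illustrated with a small sketch. The index conventions below (0-based, row-major for the fully connected rule, a valid stride-1 window for the convolution rule) are assumptions made for this example and are not taken from the paper's propositions; `apply_rule`, `fc_rule`, and `conv1d_rule` are hypothetical names.

```python
def apply_rule(x, theta, rule, m):
    """Linear part of a rule based layer (no bias, no activation).

    Computes y_i = sum_j theta[rule(i, j)] * x_j for i in 0..m-1, where
    rule(i, j) returns None at positions that carry no weight.
    """
    y = []
    for i in range(m):
        s = 0.0
        for j in range(len(x)):
            k = rule(i, j)
            if k is not None:
                s += theta[k] * x[j]
        y.append(s)
    return y

def fc_rule(n):
    # Assumed convention: every position (i, j) gets its own parameter,
    # stored row by row, so the layer has n * m independent weights.
    return lambda i, j: i * n + j

def conv1d_rule(kernel_size):
    # Assumed convention: output i sees inputs i .. i + kernel_size - 1 and
    # all outputs share the same kernel weights (valid window, stride 1).
    return lambda i, j: (j - i) if 0 <= j - i < kernel_size else None

# Fully connected: theta read row by row is the matrix [[1, 0, 0], [0, 1, 1]].
print(apply_rule([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0, 1.0, 1.0], fc_rule(3), m=2))

# Convolution: the kernel [1, -1] slides over a length-4 input.
print(apply_rule([1.0, 2.0, 3.0, 4.0], [1.0, -1.0], conv1d_rule(2), m=3))
```

In both cases the rule ignores the input values, which is why these classical layers appear as static special cases of the dynamic construction.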

Paper Structure (32 sections, 6 theorems, 13 equations, 6 figures, 6 tables)

Key Result

Proposition 1

Let $f(-, \Theta,\mathbf{R}_{\operatorname{FC}}):\mathbb{R}^{n}\longrightarrow\mathbb{R}^{m}$ be a rule based layer of a neural network as defined in Eq. (eq:RBL) (without bias term) with learnable parameters $\Theta=\{w_1, \ldots, w_{n\cdot m}\}$, and let $y=\mathbf{f}^{i}(x)$ be the result of the first $i-1$ layers. Then for the rule function $\mathbf{R}_{\operatorname{FC}}(x):[m]\times[n]\rightarrow \ldots$

Figures (6)

  • Figure 1: Visualization of the learned parameters of the best RuleGNN model on DHFR (a) and IMDB-BINARY (b) for three different random graphs from the test set. The label of the graph is given on the left side of the figure. Positive weights are denoted by red arrows and negative weights by blue arrows. The thickness and color of an arrow correspond to the absolute value of the weight. The size of the nodes corresponds to the bias values. The second to fourth columns of (a) and (b) show, respectively, all weights, the $10$ largest, and the $5$ largest positive and negative weights.
  • Figure 2: Information propagation in a simple two layer RuleGNN based on the molecule graphs of ethylene (left) and cyclopropenylidene (right) and the rules $\mathbf{R}_{\operatorname{Mol}}$ (eq:RuleMol) and $\mathbf{R}_{\operatorname{Aggr}}^k$ (eq:RuleMolFinal). The input signal is propagated from left to right. The graph nodes represent the neurons of the neural network. Edges of the same color denote shared weights in a layer.
  • Figure 3: Molecule graphs of ethylene (left) and cyclopropenylidene (right). The indices denote the order of the nodes.
  • Figure 4: Example graphs from the Snowflakes dataset. The brown node in the circle is labeled by $1$ and the other nodes by $0$. The label of the graph is determined by the subgraph attached to the brown node.
  • Figure 5: The graphs $M_0, M_1, M_2$ and $M_3$ [naik2024iterative] that are not distinguishable by the 1-WL test.
  • ...and 1 more figure
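The 1-WL test mentioned in Figure 5 can be sketched as standard color refinement. The example below is a generic illustration, not code from the paper, and the graphs it uses (a 6-cycle, two disjoint triangles, a 3-node path, a triangle) are textbook examples chosen for this sketch rather than the graphs $M_0, \ldots, M_3$; `wl_colors` is a hypothetical name.

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    """Run 1-WL color refinement on an unlabeled graph.

    adj maps each node to a list of its neighbors. A node's new color is the
    pair (old color, sorted colors of its neighbors); using the nested tuple
    itself as the color keeps colors comparable across different graphs.
    Returns the multiset of colors after the given number of rounds.
    """
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

def cycle(nodes):
    # Helper: adjacency lists of a simple cycle over the given node names.
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}

six_cycle = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = cycle([0, 1, 2])

# Equal color multisets mean 1-WL cannot tell the two graphs apart.
print(wl_colors(six_cycle) == wl_colors(two_triangles))
print(wl_colors(path3) == wl_colors(triangle))
```

The 6-cycle and the two triangles are both 2-regular on six nodes, so every node receives the same color in every round, while the path and the triangle already differ in their degree sequences.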

Theorems & Definitions (11)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Theorem 1
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Theorem 2: Expressive Power of RuleGNNs
  • ...and 1 more