Improving the Expressiveness of $K$-hop Message-Passing GNNs by Injecting Contextualized Substructure Information

Tianjun Yao; Yiongxu Wang; Kun Zhang; Shangsong Liang

TL;DR

Standard message-passing GNNs (MPGNNs) are bounded in expressive power by the $1$-WL test, and this work shows that $K$-hop message passing struggles to capture the internal substructure of $K$-hop ego-nets. It introduces a substructure encoding function $f$ and contextualized substructure information, culminating in the SEK-1-WL color refinement and SEK-GNN, which are provably more powerful than $K$-hop $1$-WL and Subgraph $1$-WL and not less powerful than $3$-WL. The encoding relies on an efficient random-walk-based feature set, including self-return probabilities and related landing probabilities, which can be computed in parallel. Empirically, SEK-GNN achieves state-of-the-art or competitive results on synthetic benchmarks, graph classification suites, and QM9 molecular properties, while maintaining lower space complexity than many subgraph-based methods. The work therefore provides a practical, theoretically grounded path to more expressive GNNs, with potential extensions to other subgraph-based architectures and encodings.
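To make the random-walk features mentioned above concrete, here is a minimal sketch of self-return probabilities on a small graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `return_probabilities`, the adjacency-list input, and the choice of a uniform random walk are all hypothetical choices made for this example.

```python
def return_probabilities(adj, steps):
    """For each node v, list P(a t-step uniform random walk from v ends at v), t = 1..steps.

    `adj` maps each node to a list of its neighbors (undirected graph).
    Assumption for this sketch: probability mass at a node without
    neighbors is simply dropped.
    """
    result = {}
    for v in adj:
        dist = {v: 1.0}  # walk starts at v with probability 1
        probs = []
        for _ in range(steps):
            nxt = {}
            for u, p in dist.items():
                if not adj[u]:
                    continue  # isolated node: nothing to propagate
                share = p / len(adj[u])  # uniform transition to each neighbor
                for w in adj[u]:
                    nxt[w] = nxt.get(w, 0.0) + share
            dist = nxt
            probs.append(dist.get(v, 0.0))
        result[v] = probs
    return result


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(return_probabilities(triangle, 3)[0])
print(return_probabilities(hexagon, 3)[0])
```

Every node in both graphs has degree 2, yet the 3-step return probability is nonzero in the triangle and zero in the 6-cycle, which is the kind of substructure signal a degree-based view misses.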

Abstract

Graph neural networks (GNNs) have become the \textit{de facto} standard for representation learning on graphs and have achieved state-of-the-art performance in many graph-related tasks; however, it has been shown that the expressive power of standard GNNs is at most equivalent to that of the 1-dimensional Weisfeiler-Lehman (1-WL) test. Recently, several lines of work have aimed to enhance the expressive power of graph neural networks. One line develops $K$-hop message-passing GNNs, where a node's representation is updated by aggregating information not only from its direct neighbors but from all neighbors within $K$ hops of the node. Another line leverages subgraph information to enhance expressive power and is proven to be strictly more powerful than the 1-WL test. In this work, we discuss the limitation of $K$-hop message-passing GNNs and propose a \textit{substructure encoding function} to uplift the expressive power of any $K$-hop message-passing GNN. We further inject contextualized substructure information to enhance the expressiveness of $K$-hop message-passing GNNs. Our method is provably more powerful than previous works on $K$-hop graph neural networks and 1-WL subgraph GNNs, a specific type of subgraph-based GNN model, and not less powerful than 3-WL. Empirically, our proposed method sets new state-of-the-art performance or achieves comparable performance on a variety of datasets. Our code is available at \url{https://github.com/tianyao-aka/Expresive_K_hop_GNNs}.
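The 1-WL bound referred to in the abstract can be illustrated with a small sketch of color refinement. This is the generic textbook procedure with uniform initial colors, written for this summary rather than taken from the paper's code; the function name `wl_distinguishes` and the adjacency-list format are assumptions.

```python
from collections import Counter


def wl_distinguishes(adj_g, adj_h, iterations=3):
    """Return True if 1-WL color refinement separates the two graphs.

    Both graphs are refined jointly (as a disjoint union) so that
    color ids mean the same thing in each graph.
    """
    adj = {("g", v): [("g", u) for u in nbrs] for v, nbrs in adj_g.items()}
    adj.update({("h", v): [("h", u) for u in nbrs] for v, nbrs in adj_h.items()})

    colors = {v: 0 for v in adj}  # uniform initial coloring (no node features)
    for _ in range(iterations):
        # New signature: own color plus the sorted multiset of neighbor colors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}

    hist_g = Counter(c for (tag, _), c in colors.items() if tag == "g")
    hist_h = Counter(c for (tag, _), c in colors.items() if tag == "h")
    return hist_g != hist_h


two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(wl_distinguishes(two_triangles, hexagon))
```

Two disjoint triangles and a 6-cycle are both 2-regular on six nodes, so the refinement never splits the initial color class and the test cannot tell them apart. This is the kind of failure case that motivates more expressive schemes.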

Paper Structure (21 sections, 5 theorems, 6 equations, 3 figures, 6 tables)

Introduction
Preliminary
Notation
Weisfeiler-Lehman Test
More expressive GNNs
How to design the substructure encoding function
Substructure Enhanced $K$-hop 1-WL Algorithm
Space and time complexity
Related Work
Experiments
Synthetic datasets
Real-world datasets
Graph classification
Graph regression
Conclusion
...and 6 more sections

Key Result

Theorem 1

Given two $n$-node $r$-regular graphs $G$ and $H$, let $3 \leq r<(2 \log 2 n)^{1 / 2}$ and $\epsilon$ be a fixed constant. For two $K$-hop ego-networks $G_u^K$ and $H_v^K$ with $K$ being at most $\left\lceil\left(\frac{1}{2}+\epsilon\right) \frac{\log 2 n}{\log (r-1)} + 1 \right\rceil$, $2K$ steps o

Figures (3)

Figure 1: One example where node 1 in $G_1$ and $G_2$ induces the same attention pattern given a 1-layer 2-hop message-passing GNN method, and induces different attention patterns using a 1-layer 1-WL subgraph GNN where a 2-layer base GNN encoder is used.
Figure 2: Two non-isomorphic graphs with the same intersection array, $\left\{6,3;1,2\right\}$; for the two red nodes, we show one of the 2-hop neighbors highlighted in green, and their corresponding 1-hop induced subgraphs.
Figure 3: A toy example illustrating how contextualized substructure information is injected to enrich the representation in SEK-GNN. In this example, the $k$-hop neighbors of node 1 are extracted for all $k \in [K]$ ($K=2$ here; $K$ ranges between 3 and 6 in the experiments). For each node, its $h$-hop induced subgraph is then extracted ($h=1$ here; typically 3 to 6 in the experiments) and the substructure information is calculated using $f(\cdot)$. For each $k$-hop neighbor of node 1 with $k \in [K]$, the node representation consists of two parts: i) the node embedding from the previous iteration and ii) the substructure features encoded by $f(\cdot)$. Finally, node 1's representation is updated according to Equation \ref{eq:sek-gnn}. Note that the induced subgraph of node 1 itself is also extracted and encoded using $f(\cdot)$; this is omitted from the figure.
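The extraction steps described in the Figure 3 caption can be sketched as follows. This is a rough illustration only. It assumes that "$k$-hop neighbors" means nodes at shortest-path distance exactly $k$, and it uses a placeholder encoding `toy_encoding` (node and edge counts of the induced subgraph) in place of the paper's random-walk-based $f(\cdot)$. All function names are hypothetical, and the learned update of the referenced equation is not reproduced.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search: shortest-path distance to every node within max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def k_hop_neighbors(adj, source, K):
    """Group nodes by exact distance k = 1..K from source (assumed neighbor definition)."""
    dist = hop_distances(adj, source, K)
    return {k: sorted(v for v, d in dist.items() if d == k) for k in range(1, K + 1)}


def induced_ego_subgraph(adj, center, h):
    """Subgraph induced by all nodes within h hops of center."""
    keep = set(hop_distances(adj, center, h))
    return {v: [u for u in adj[v] if u in keep] for v in keep}


def toy_encoding(subgraph):
    """Placeholder for f: (number of nodes, number of edges). NOT the paper's encoding."""
    n_edges = sum(len(nbrs) for nbrs in subgraph.values()) // 2
    return (len(subgraph), n_edges)


def contextualized_features(adj, source, K, h):
    """For each hop k, the sorted multiset of encodings of the k-hop neighbors' ego subgraphs."""
    hops = k_hop_neighbors(adj, source, K)
    return {
        k: sorted(toy_encoding(induced_ego_subgraph(adj, v, h)) for v in nodes)
        for k, nodes in hops.items()
    }


# A triangle 0-1-2 with a tail 2-3-4.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
print(k_hop_neighbors(graph, 0, 2))
print(contextualized_features(graph, 0, K=2, h=1))
```

In a full model, these per-hop encodings would be combined with the node embeddings from the previous iteration before aggregation; here they are only collected per hop to show what information each neighbor contributes.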

Theorems & Definitions (5)

Theorem 1
Theorem 2
Proposition 1
Lemma 1
Lemma 2
