Beyond Message Passing: Neural Graph Pattern Machine

Zehong Wang, Zheyuan Zhang, Tianyi Ma, Nitesh V Chawla, Chuxu Zhang, Yanfang Ye

TL;DR

GPM replaces message passing with pattern-based learning: a random-walk tokenizer samples graph substructures, an encoder captures both semantic and anonymous path information, and a transformer aggregates the dominant patterns. By design, it improves expressivity beyond 1-WL and mitigates over-squashing; empirically, it shows gains across node, link, and graph tasks and generalizes robustly out of distribution. A class token adds interpretability by highlighting influential patterns such as stars and rings. GPM scales to large graphs with efficient training, and it could later be combined with adaptive pattern vocabularies or unsupervised learning. The approach broadens the toolbox for graph representation learning, emphasizing direct pattern-centric reasoning over neighborhood aggregation.

Abstract

Graph learning tasks often hinge on identifying key substructure patterns -- such as triadic closures in social networks or benzene rings in molecular graphs -- that underpin downstream performance. However, most existing graph neural networks (GNNs) rely on message passing, which aggregates local neighborhood information iteratively and struggles to explicitly capture such fundamental motifs, like triangles, k-cliques, and rings. This limitation hinders both expressiveness and long-range dependency modeling. In this paper, we introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies. Empirical evaluations across four standard tasks -- node classification, link prediction, graph classification, and graph regression -- demonstrate that GPM outperforms state-of-the-art baselines. Further analysis reveals that GPM exhibits strong out-of-distribution generalization, desirable scalability, and enhanced interpretability. Code and datasets are available at: https://github.com/Zehong-Wang/GPM.
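
To make the idea of extracting patterns with random walks concrete, here is a minimal sketch, not the authors' implementation. It assumes an unweighted graph stored as an adjacency dictionary and uses unbiased walks; the function names `sample_walk` and `tokenize_node`, the toy graph, and all parameter values are hypothetical choices for illustration.

```python
import random


def sample_walk(adj, start, length, rng):
    """Sample one unbiased random walk of up to `length` steps from `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def tokenize_node(adj, node, num_patterns, length, seed=0):
    """Represent `node` by a set of walks that start at it (hypothetical helper)."""
    rng = random.Random(seed)
    return [sample_walk(adj, node, length, rng) for _ in range(num_patterns)]


# Toy graph (an assumption): a triangle 0-1-2 with a pendant node 3 on node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
patterns = tokenize_node(adj, node=0, num_patterns=4, length=3)
for walk in patterns:
    print(walk)
```

Each sampled walk is one candidate pattern for node 0. In a full pipeline these walks would then be encoded and weighted by a downstream model, which this sketch does not attempt.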

Paper Structure

This paper contains 36 sections, 7 theorems, 8 equations, 8 figures, and 12 tables.

Key Result

Proposition 3.2

(Informal) Given a node $v \in {\mathcal{V}}$, assume the task requires information from the $k$-hop ego-graph ${\mathcal{B}}(v, k) = ({\mathcal{V}}', {\mathcal{E}}')$. A sufficiently large set of patterns, sampled via $l$-length anonymous walks with $l = O(|{\mathcal{E}}'|)$, can provide distinguis

Figures (8)

  • Figure 1: The workflow of Neural Graph Pattern Machine (GPM). Given a graph dataset, GPM utilizes a random walk tokenizer to extract a set of patterns representing the learning instances (nodes, edges, or graphs). These patterns are first encoded by a sequential model and then processed by a transformer encoder, which identifies the dominant patterns relevant to downstream tasks.
  • Figure 2: The overview framework of GPM.
  • Figure 3: Examples of anonymous paths.
  • Figure 4: Model scalability analysis. (Top) Number of Parameters vs. Accuracy. (Bottom) Number of GPUs vs. Acceleration Ratio.
  • Figure 5: Training loss and model performance on Products with varying sampling criteria.
  • ...and 3 more figures

Theorems & Definitions (8)

  • Definition 3.1: Anonymous Walk (ivanov2018anonymous)
  • Proposition 3.2
  • Proposition 3.3
  • Theorem 3.4
  • Corollary 3.5
  • Theorem 3.1: Theorem 1 of micali2016reconstructing
  • Theorem 3.4: Theorem 4 of wang2024nonconvolutional
  • Theorem 3.5: Corollary 4.1 of wang2024nonconvolutional
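
The text of Definition 3.1 is not reproduced in this listing, so the sketch below follows the commonly used notion of an anonymous walk: each node in a walk is replaced by the index of its first occurrence, which discards node identities and keeps only the structural shape. The function name `anonymize` and the 0-based indexing are assumptions made for illustration.

```python
def anonymize(walk):
    """Map a walk to its anonymous form: first-occurrence index of each node."""
    first_seen = {}
    anonymous = []
    for node in walk:
        if node not in first_seen:
            # Assign the next unused index the first time a node appears.
            first_seen[node] = len(first_seen)
        anonymous.append(first_seen[node])
    return tuple(anonymous)


# Two walks over different nodes that both close a triangle
# share the same anonymous form.
triangle_a = anonymize(["a", "b", "c", "a"])
triangle_b = anonymize([5, 7, 9, 5])
print(triangle_a, triangle_b)  # (0, 1, 2, 0) (0, 1, 2, 0)

# A walk that steps back to its start has a different anonymous form.
back_and_forth = anonymize([7, 3, 7, 9])
print(back_and_forth)  # (0, 1, 0, 2)
```

Because the anonymous form ignores node labels, walks taken in different parts of a graph, or in different graphs, can be recognized as instances of the same pattern.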