Revisiting Random Walks for Learning on Graphs

Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade, Youngmin Ryou, Seunghoon Hong

TL;DR

This work formalizes random walk neural networks (RWNNs) for graph learning, integrating a random walk-induced record with a flexible reader neural network. By enforcing probabilistic isomorphism invariance through anonymization and named-neighbor recording, RWNNs can universally approximate graph functions in probability, surpassing conventional WL-based expressiveness. The authors connect RWNNs to Markov-chain theory, showing how restarts and walk design mitigate under-reaching and reduce issues like over-smoothing seen in MPNNs. Empirically, RWNNs demonstrate strong performance on graph isomorphism tasks, including SR25 with a DeBERTa reader, and show competitive transductive classification on large graphs using Llama3, highlighting the practical potential of text-based walk records and language-model readers for graph tasks.

Abstract

We revisit a simple model class for machine learning on graphs, where a random walk on a graph produces a machine-readable record, and this record is processed by a deep neural network to directly make vertex-level or graph-level predictions. We call these stochastic machines random walk neural networks (RWNNs), and through principled analysis, show that we can design them to be isomorphism invariant while capable of universal approximation of graph functions in probability. A useful finding is that almost any kind of record of random walks guarantees probabilistic invariance as long as the vertices are anonymized. This enables us, for example, to record random walks in plain text and adopt a language model to read these text records to solve graph tasks. We further establish a parallelism to message passing neural networks using tools from Markov chain theory, and show that over-smoothing in message passing is alleviated by construction in RWNNs, while over-squashing manifests as probabilistic under-reaching. We empirically demonstrate RWNNs on a range of problems, verifying our theoretical analysis and demonstrating the use of language models for separating strongly regular graphs where the 3-WL test fails, and for transductive classification on the arXiv citation network. Code is available at https://github.com/jw9730/random-walk.
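The pipeline described in the abstract (walk, anonymize, record as text) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_walk`, `anonymize`, and `to_text`, the toy 4-cycle, and the dash-separated record format are all assumptions made for illustration, and the reader network is omitted.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform random walk; returns the visited vertices (length + 1 entries)."""
    walk = [start]
    for _ in range(length):
        # Move to a uniformly chosen neighbor of the current vertex.
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

def anonymize(walk):
    """Rename each vertex by the order in which it was first visited (1, 2, 3, ...)."""
    names = {}
    out = []
    for v in walk:
        if v not in names:
            names[v] = len(names) + 1
        out.append(names[v])
    return out

def to_text(walk):
    """Hypothetical text record: anonymized vertex names joined by dashes."""
    return "-".join(str(i) for i in anonymize(walk))

# Toy graph (assumption): a 4-cycle with arbitrary vertex labels.
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
rng = random.Random(0)
walk = random_walk(adj, "a", 8, rng)
record = to_text(walk)  # a plain-text string that a language model could read
```

Because the record only contains first-visit indices, the original labels `"a"` to `"d"` never reach the reader, which is the property the abstract relies on for probabilistic invariance.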

Paper Structure (58 sections, 25 theorems, 103 equations, 15 figures, 14 tables, 4 algorithms)

Key Result

Proposition 2.1

$X_\theta(\cdot)$ is invariant in probability, if its random walk algorithm is invariant in probability and its recording function is invariant.
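The second condition of the proposition, an invariant recording function, can be checked on a small example. The sketch below assumes anonymization by first-visit order as the recording function; the toy graph, the relabeling `pi`, and the helper names are hypothetical. It takes a walk on a graph $G$, maps it through an isomorphism to a walk on $H$, and confirms that both produce the same record.

```python
def anonymize(walk):
    """Rename each vertex by the order in which it was first visited."""
    names = {}
    out = []
    for v in walk:
        if v not in names:
            names[v] = len(names) + 1
        out.append(names[v])
    return out

def relabel(adj, pi):
    """Apply a vertex bijection pi to an adjacency-list graph."""
    return {pi[v]: [pi[u] for u in nbrs] for v, nbrs in adj.items()}

def is_walk(adj, walk):
    """True if every consecutive pair in the walk is an edge of the graph."""
    return all(b in adj[a] for a, b in zip(walk, walk[1:]))

# Toy graph G (assumption): a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj_g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
pi = {0: "w", 1: "x", 2: "y", 3: "z"}  # an isomorphism G -> H
adj_h = relabel(adj_g, pi)

walk_g = [0, 2, 3, 2, 1, 0]
walk_h = [pi[v] for v in walk_g]  # the corresponding walk on H

record_g = anonymize(walk_g)
record_h = anonymize(walk_h)
same_record = record_g == record_h
```

For the first condition, a uniform random walk is a natural candidate, since its transition probabilities depend only on vertex degrees, which an isomorphism preserves; the sketch does not test this part.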

Figures (15)

  • Figure 1: An RWNN that reads text record using a language model.
  • Figure 2: Cover times of random walks on varying sizes of lollipop graphs. NB is non-backtracking.
  • Figure 3: Over-smoothing and over-squashing (MSE $\downarrow$) for various walk lengths $l$.
  • Figure 4: Two CSL graphs and text records of random walks from Algorithms \ref{alg:random_walk} and \ref{alg:recording_function_anonymization_neighborhoods}. The task is graph classification into 10 isomorphism types $\mathrm{csl}(41, s)$ for skip length $s\in \{2, 3, 4, 5, 6, 9, 11, 12, 13, 16\}$. In random walks, we label vertices by anonymization and color edges by their time of discovery.
  • Figure 5: SR16 graphs and text records of random walks from Algorithms \ref{alg:random_walk} and \ref{alg:recording_function_anonymization_neighborhoods}. The task is graph classification into two isomorphism types $\{4\times4\text{ rook's graph}, \text{Shrikhande graph}\}$. In random walks, we label vertices by anonymization and color edges by their time of discovery.
  • ...and 10 more figures

Theorems & Definitions (44)

  • Proposition 2.1
  • Proposition 2.2
  • Proposition 2.3
  • Theorem 2.4
  • Theorem 2.5
  • Definition 3.1
  • Theorem 3.2
  • Definition 3.3
  • Theorem 3.4
  • Theorem 3.5
  • ...and 34 more