Revisiting Random Walks for Learning on Graphs
Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade, Youngmin Ryou, Seunghoon Hong
TL;DR
This work formalizes random walk neural networks (RWNNs) for graph learning, which pair a record produced by a random walk with a flexible reader neural network. By enforcing probabilistic isomorphism invariance through anonymization and named-neighbor recording, RWNNs can universally approximate graph functions in probability, surpassing conventional WL-based expressiveness. The authors connect RWNNs to Markov chain theory, showing how restarts and walk design mitigate under-reaching and how RWNNs alleviate the over-smoothing seen in message passing neural networks (MPNNs). Empirically, RWNNs perform strongly on graph isomorphism tasks, including SR25 with a DeBERTa reader, and are competitive on transductive classification on large graphs using Llama3, highlighting the practical potential of text-based walk records and language-model readers for graph tasks.
Abstract
We revisit a simple model class for machine learning on graphs, where a random walk on a graph produces a machine-readable record, and this record is processed by a deep neural network to directly make vertex-level or graph-level predictions. We call these stochastic machines random walk neural networks (RWNNs), and through principled analysis, show that we can design them to be isomorphism invariant while remaining capable of universal approximation of graph functions in probability. A useful finding is that almost any kind of record of random walks guarantees probabilistic invariance as long as the vertices are anonymized. This enables us, for example, to record random walks in plain text and adopt a language model to read these text records to solve graph tasks. We further establish a parallelism to message passing neural networks using tools from Markov chain theory, and show that over-smoothing in message passing is alleviated by construction in RWNNs, while over-squashing manifests as probabilistic under-reaching. We empirically demonstrate RWNNs on a range of problems, verifying our theoretical analysis and demonstrating the use of language models for separating strongly regular graphs where the 3-WL test fails, and for transductive classification on the arXiv citation network. Code is available at https://github.com/jw9730/random-walk.
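To make the anonymization idea concrete, the following is a minimal Python sketch and not the authors' implementation (which lives in the repository linked above). It assumes a uniform random walk over an adjacency-list graph, renames each vertex by the order in which the walk first visits it, and writes the result as a dash-separated string. The function names `anonymized_walk` and `walk_to_text`, and the text format itself, are hypothetical choices made for illustration.

```python
import random


def anonymized_walk(adj, start, length, rng):
    """Run a uniform random walk and return its anonymized record.

    Vertices are renamed 1, 2, 3, ... in order of first visit, so the
    record does not depend on the original vertex labels.
    (Hypothetical helper; the record format is an assumption.)
    """
    names = {start: 1}          # original vertex -> anonymous name
    record = [1]
    v = start
    for _ in range(length):
        v = rng.choice(adj[v])  # step to a uniformly chosen neighbor
        if v not in names:
            names[v] = len(names) + 1
        record.append(names[v])
    return record


def walk_to_text(record):
    """Serialize an anonymized record as plain text, e.g. for a language-model reader."""
    return "-".join(str(n) for n in record)


# A 4-cycle with string labels.
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}

# The same graph under a relabeling pi (an isomorphic copy).
pi = {"a": "z", "b": "y", "c": "x", "d": "w"}
adj_relabeled = {pi[v]: [pi[u] for u in nbrs] for v, nbrs in adj.items()}

# With the same random seed the walk on the copy follows the image of the
# original walk, and the two anonymized records coincide.
record = anonymized_walk(adj, "a", 10, random.Random(0))
record_relabeled = anonymized_walk(adj_relabeled, pi["a"], 10, random.Random(0))
print(walk_to_text(record))
print(record == record_relabeled)
```

The comparison at the end illustrates only the intuition behind probabilistic invariance: an isomorphic copy of the graph induces the same distribution over anonymized records, so a reader network that sees only these records cannot depend on the original vertex labels. The sketch omits named-neighbor recording and restarts.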
