Learning Long Range Dependencies on Graphs via Random Walks

Dexiong Chen, Till Hendrik Schulz, Karsten Borgwardt

TL;DR

A novel architecture that overcomes the shortcomings of both message-passing graph neural networks and graph transformers by combining the long-range information of random walks with local message passing, and that leverages recent advances in sequence models to capture long-range dependencies within these walks.

Abstract

Message-passing graph neural networks (GNNs) excel at capturing local relationships but struggle with long-range dependencies in graphs. In contrast, graph transformers (GTs) enable global information exchange but often oversimplify the graph structure by representing graphs as sets of fixed-length vectors. This work introduces a novel architecture that overcomes the shortcomings of both approaches by combining the long-range information of random walks with local message passing. By treating random walks as sequences, our architecture leverages recent advances in sequence models to effectively capture long-range dependencies within these walks. Based on this concept, we propose a framework that offers (1) more expressive graph representations through random walk sequences, (2) the ability to use any sequence model for capturing long-range dependencies, and (3) the flexibility to integrate various GNN and GT architectures. Our experimental evaluations demonstrate that our approach achieves significant performance improvements on 19 graph and node benchmark datasets, notably outperforming existing methods by up to 13% on the PascalVoc-SP and COCO-SP datasets. The code is available at https://github.com/BorgwardtLab/NeuralWalker.

Paper Structure (61 sections, 15 theorems, 28 equations, 4 figures, 14 tables)

Key Result

Theorem 4.2

For some functional space ${\mathcal{F}}$ of functions on walk feature vectors, we define the following distance $d_{{\mathcal{F}}}:{\mathcal{G}}\times {\mathcal{G}}\to{\mathbb R}_{+}$: Then $({\mathcal{G}}_n,d_{{\mathcal{F}},\ell})$ is a metric space if ${\mathcal{F}}$ is a universal space and $\ell\geq 4n^3$. If ${\mathcal{F}}$ contains $f$, then for any $G,G'\in {\mathcal{G}}_n$, we have In p

Figures (4)

  • Figure 1: Message passing efficiently captures locally sparse subgraphs, like $k$-star subgraphs, while random walks struggle, requiring a length of $2k$.
  • Figure 2: Overview of the NeuralWalker architecture. The random walk sampler samples $m$ random walks independently without replacement; the walk embedder computes walk embeddings given the node/edge embeddings at the current layer; the walk aggregator aggregates walk features into the node features via pooling of the node features encountered in all the walks passing through the node.
  • Figure 3: Validation performance when varying sampling rate and length of random walks.
  • Figure 4: An example of the identity encoding and adjacency encoding presented in Section \ref{sec:walk_sampler}. On the random walk colored in red, we have $\mathrm{id}_W[4,3]=1$ as $w_4=w_0=6$. We have $\mathrm{adj}_W[3,2]=1$ as $w_3w_0\in E$ is an edge of the graph.
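
The two encodings in the Figure 4 caption can be illustrated with a minimal Python sketch. It assumes an indexing convention inferred only from the caption's two examples: entry $[i,j]$ compares position $i$ of the walk with position $i-j-1$. The function names `sample_walk` and `walk_encodings`, the `window` parameter, and the toy graph are hypothetical and are not taken from the paper or its code release; the toy graph is not the graph drawn in the figure.

```python
import random


def sample_walk(adj, start, length, rng):
    """Sample a uniform random walk with `length` steps starting at `start`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Returns a list of at most length + 1 nodes.
    """
    walk = [start]
    for _ in range(length):
        neighbors = sorted(adj[walk[-1]])  # sorted for reproducibility
        if not neighbors:  # isolated node: the walk cannot continue
            break
        walk.append(rng.choice(neighbors))
    return walk


def walk_encodings(walk, adj, window):
    """Compute identity and adjacency encodings of a walk.

    Assumed convention (inferred from the Figure 4 caption):
      id_enc[i][j]  = 1 if walk[i] == walk[i - j - 1]
      adj_enc[i][j] = 1 if {walk[i], walk[i - j - 1]} is an edge
    Entries whose compared position would be negative stay 0.
    """
    n = len(walk)
    id_enc = [[0] * window for _ in range(n)]
    adj_enc = [[0] * window for _ in range(n)]
    for i in range(n):
        for j in range(window):
            k = i - j - 1  # earlier position compared against position i
            if k < 0:
                break
            id_enc[i][j] = int(walk[i] == walk[k])
            adj_enc[i][j] = int(walk[k] in adj[walk[i]])
    return id_enc, adj_enc


# Hypothetical toy graph: a 4-cycle 6 - 1 - 2 - 3 - 6.
adj = {6: {1, 3}, 1: {6, 2}, 2: {1, 3}, 3: {2, 6}}

# A hand-picked walk that returns to its start, mirroring w_4 = w_0 = 6.
walk = [6, 1, 2, 3, 6]
id_enc, adj_enc = walk_encodings(walk, adj, window=4)
print(id_enc[4][3], adj_enc[3][2])

# A sampled walk on the same graph, for comparison.
sampled = sample_walk(adj, start=6, length=4, rng=random.Random(0))
print(sampled)
```

On the hand-picked walk, `id_enc[4][3]` is 1 because the walk returns to node 6 after four steps, and `adj_enc[3][2]` is 1 because $w_3 = 3$ and $w_0 = 6$ are joined by an edge, matching the pattern described in the caption. Under this convention `adj_enc[i][0]` is always 1 for $i \geq 1$, since consecutive walk nodes are adjacent; whether the original encoding skips that trivial entry is not stated in this summary.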

Theorems & Definitions (26)

  • Definition 4.1: Walk feature vector
  • Theorem 4.2: Lipschitz continuity
  • Theorem 4.3: Injectivity
  • Theorem 4.4
  • Theorem 4.5: Complexity
  • Definition C.1: Walk feature vector
  • Lemma C.2
  • proof
  • Lemma C.3: lovasz1993random
  • Lemma C.4: lovasz1993random
  • ...and 16 more