Improving the Effective Receptive Field of Message-Passing Neural Networks

Shahaf E. Finder, Ron Shapira Weber, Moshe Eliasof, Oren Freifeld, Eran Treister

TL;DR

This work addresses the limited effective receptive field (ERF) and over-squashing in MPNNs by introducing Interleaved Multiscale MPNNs (IM-MPNN). IM-MPNN combines hierarchical graph coarsening with scale-mix layers between consecutive scales, so information can propagate across distant regions without significantly increasing depth or parameter count. A theoretical ERF analysis and extensive experiments on the Long-Range Graph Benchmark (LRGB) show substantial gains in capturing long-range dependencies while maintaining efficiency, outperforming several baselines including GCN, GIN, and some Graph Transformers in various settings. The approach offers a principled, scalable way to enhance the expressiveness of GNNs on large, complex graphs and in heterophilic domains, with practical impact on tasks requiring long-range information flow.

Abstract

Message-Passing Neural Networks (MPNNs) have become a cornerstone for processing and analyzing graph-structured data. However, their effectiveness is often hindered by phenomena such as over-squashing, where long-range dependencies or interactions are inadequately captured and expressed in the MPNN output. This limitation mirrors the challenges of the Effective Receptive Field (ERF) in Convolutional Neural Networks (CNNs), where the theoretical receptive field is underutilized in practice. In this work, we show and theoretically explain the limited ERF problem in MPNNs. Furthermore, inspired by recent advances in ERF augmentation for CNNs, we propose an Interleaved Multiscale Message-Passing Neural Network (IM-MPNN) architecture to address these problems in MPNNs. Our method incorporates a hierarchical coarsening of the graph, enabling message-passing across multiscale representations and facilitating long-range interactions without excessive depth or parameterization. Through extensive evaluations on benchmarks such as the Long-Range Graph Benchmark (LRGB), we demonstrate substantial improvements over baseline MPNNs in capturing long-range dependencies while maintaining computational efficiency.
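
To make the limited-ERF claim concrete, the following is a minimal sketch, not the paper's measurement protocol. It assumes a simplified linear GCN (no learned weights, no nonlinearities) on a path graph, so that the influence of every input node on the central node after $L$ layers is just one row of the $L$-th power of the normalized adjacency matrix. The function names and the choice of graph are illustrative assumptions.

```python
import numpy as np

def path_graph_adjacency(n):
    """Adjacency matrix of an undirected path graph with n nodes."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A

def gcn_propagation_matrix(A):
    """Symmetric-normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def influence_on_center(n, num_layers):
    """Row of P^L for the central node: the weight of each input node in its output."""
    P = gcn_propagation_matrix(path_graph_adjacency(n))
    P_L = np.linalg.matrix_power(P, num_layers)
    return P_L[n // 2]

# Assumed toy setting: 41 nodes, 10 linear message-passing layers.
n, num_layers = 41, 10
center = n // 2
influence = influence_on_center(n, num_layers)

for hops in (0, 5, 10, 11):
    print(f"{hops:2d} hops from center: {influence[center + hops]:.2e}")
```

In this toy setting, nodes up to 10 hops away lie inside the theoretical receptive field, yet their weight shrinks rapidly with distance, and nodes beyond 10 hops contribute nothing. This is the gap between the theoretical and the effective receptive field that the abstract describes.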

Paper Structure

This paper contains 45 sections, 22 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Measuring the contribution of each node to the output of the central node in a graph with a maximal distance of 10 hops from the center. A brighter color marks a larger contribution. The contribution decays with distance, akin to the ERF decay reported for CNNs by Luo et al. (2016). The first three panels are MPNNs (GCN) with 10, 20, and 30 MP layers, while the last three are IM-MPNNs with a GCN backbone, 10 MP layers, and 1, 2, and 3 scales.
  • Figure 2: IM-MPNN architecture for scales=3. The input is first passed through an encoding stage (PE, SE, Graph features, etc.). Then, the graph is coarsened $S$ times. MP protocols (GCN, GINE, GatedGCN, etc.) are performed on the $S+1$ scales of the graph separately. Scale-mix layers pass the information between consecutive scales, matching each graph node with its parent and child from the coarsening process. The process is repeated $L$ times. The coarse graphs are unpooled, and the node features are concatenated to the parent node in the original graph. A GNN head is used according to the task.
  • Figure 3: An infinitely-long linear graph.
  • Figure 4: The spread of a point source over time according to the heat-equation solution (eq:solHeat), for $d=2$. For $\kappa=0.005$, the ERF is about $0.3$ at time $t=1$ and grows only to about $0.6$ by $t=5$. The ERF spreads further for higher values of $\kappa$.
  • Figure 5: Coarsening of a graph according to a given pairing (edges marked in red).
  • ...and 3 more figures
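
The pairing-based coarsening in Figure 5 and the unpooling step in Figure 2 can be illustrated with a small sketch. This is a hypothetical implementation under stated assumptions, not the authors' code: the pairing is assumed to be a set of node-disjoint edges, each pair is merged into one coarse node, unpaired nodes are kept as singletons, and features are pooled by averaging (the pooling operator is an assumption made here for illustration). All function names are made up for this example.

```python
import numpy as np

def coarsen_by_pairing(num_nodes, edges, pairing):
    """Merge each paired edge (u, v) into a single coarse node.

    Returns (parent, coarse_edges), where parent[i] is the coarse-node index
    of fine node i. Unpaired nodes become their own coarse node.
    """
    parent = -np.ones(num_nodes, dtype=int)
    next_id = 0
    for u, v in pairing:
        if parent[u] != -1 or parent[v] != -1:
            raise ValueError("pairing must be a matching (no shared nodes)")
        parent[u] = parent[v] = next_id
        next_id += 1
    for i in range(num_nodes):
        if parent[i] == -1:
            parent[i] = next_id
            next_id += 1
    # A coarse edge exists wherever a fine edge connects two different parents.
    coarse_edges = set()
    for u, v in edges:
        pu, pv = int(parent[u]), int(parent[v])
        if pu != pv:
            coarse_edges.add((min(pu, pv), max(pu, pv)))
    return parent, sorted(coarse_edges)

def pool_mean(x, parent):
    """Average fine-node features into their coarse parent (assumed pooling)."""
    num_coarse = int(parent.max()) + 1
    summed = np.zeros((num_coarse, x.shape[1]))
    np.add.at(summed, parent, x)
    counts = np.bincount(parent, minlength=num_coarse)[:, None]
    return summed / counts

def unpool(x_coarse, parent):
    """Copy each coarse node's features back to all of its children."""
    return x_coarse[parent]

# Toy example: path graph 0-1-2-3-4-5 with pairs (0,1), (2,3), (4,5).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
pairing = [(0, 1), (2, 3), (4, 5)]
parent, coarse_edges = coarsen_by_pairing(6, edges, pairing)

x = np.arange(6, dtype=float).reshape(6, 1)
x_coarse = pool_mean(x, parent)   # features on the coarse graph
x_back = unpool(x_coarse, parent) # coarse features copied back to fine nodes
```

Each coarsening step of this kind roughly halves the number of nodes, so one message-passing hop on the coarse graph covers about two hops on the original graph. Repeating the step $S$ times yields the $S+1$ scales described in Figure 2.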