Probabilistic Graph Rewiring via Virtual Nodes
Chendi Qian, Andrei Manolache, Christopher Morris, Mathias Niepert
TL;DR
This work tackles the persistent problems of under-reaching and over-squashing in message-passing GNNs by introducing implicit probabilistic rewiring through a small set of virtual nodes. An upstream model learns priors for connecting original nodes to virtual nodes under exact-$k$ constraints, and differentiable $k$-subset sampling enables end-to-end optimization without the quadratic cost of graph transformers. The resulting IPR-MPNN architecture provides long-range information flow with sub-quadratic complexity, is provably more expressive than standard MPNNs, and delivers state-of-the-art or competitive results across diverse graph and molecular benchmarks while remaining more efficient than graph transformers. The approach offers a scalable, adaptable framework for capturing long-range dependencies in large graphs, with practical implications for chemistry, biology, and network science.
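To make the exact-$k$ connection step concrete, the following is a minimal NumPy sketch of the forward pass only, not the authors' implementation. It assumes the upstream model has already produced a matrix `logits` of shape (original nodes × virtual nodes), and it uses the Gumbel top-$k$ trick as a stand-in sampler, since the summary does not say which $k$-subset sampler is used. The function names `sample_exact_k` and `virtual_node_round`, the mean aggregation, and the residual update are all illustrative assumptions. Gradient estimation through the sampler is omitted.

```python
import numpy as np

def sample_exact_k(logits, k, rng):
    """Return a 0/1 mask with exactly k ones per row (one row per original node)."""
    # Perturb the (assumed) learned priors with Gumbel noise.
    perturbed = logits + rng.gumbel(size=logits.shape)
    # Column indices of the k largest perturbed scores in each row.
    top_k = np.argpartition(-perturbed, k - 1, axis=1)[:, :k]
    mask = np.zeros_like(logits)
    np.put_along_axis(mask, top_k, 1.0, axis=1)
    return mask

def virtual_node_round(x, mask):
    """One hypothetical message round: nodes -> virtual nodes -> nodes."""
    # Each virtual node averages the features of the nodes assigned to it.
    counts = mask.sum(axis=0)[:, None]              # shape (m, 1)
    v = (mask.T @ x) / np.maximum(counts, 1.0)      # shape (m, d)
    # Each node averages its k virtual nodes and adds the result residually.
    k = mask.sum(axis=1, keepdims=True)             # shape (n, 1), all equal to k
    return x + (mask @ v) / k

rng = np.random.default_rng(0)
n, m, k, d = 6, 3, 2, 4              # nodes, virtual nodes, links per node, feature dim
logits = rng.normal(size=(n, m))     # placeholder for the upstream model's output
x = rng.normal(size=(n, d))          # placeholder node features
mask = sample_exact_k(logits, k, rng)
out = virtual_node_round(x, mask)
```

Two nodes that share a virtual node exchange information in a single round regardless of their distance in the original graph, which is the long-range shortcut the summary describes.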
Abstract
Message-passing graph neural networks (MPNNs) have emerged as a powerful paradigm for graph-based machine learning. Despite their effectiveness, MPNNs face challenges such as under-reaching and over-squashing, where limited receptive fields and structural bottlenecks hinder information flow in the graph. While graph transformers hold promise in addressing these issues, their complexity is quadratic in the number of nodes, rendering them impractical for larger graphs. Here, we propose implicitly rewired message-passing neural networks (IPR-MPNNs), a novel approach that integrates implicit probabilistic graph rewiring into MPNNs. By introducing a small number of virtual nodes, i.e., adding nodes to a given graph and connecting them to existing nodes in a differentiable, end-to-end manner, IPR-MPNNs enable long-distance message propagation while circumventing quadratic complexity. Theoretically, we demonstrate that IPR-MPNNs surpass the expressiveness of traditional MPNNs. Empirically, we validate our approach by showcasing its ability to mitigate under-reaching and over-squashing effects, achieving state-of-the-art performance across multiple graph datasets. Notably, IPR-MPNNs outperform graph transformers while being significantly more computationally efficient.
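As a rough illustration of why a few virtual nodes avoid the quadratic cost, the sketch below compares the number of node pairs a full attention layer considers with the number of node-to-virtual-node edges added by rewiring. It assumes each original node connects to exactly `k` virtual nodes and counts only those added edges. The function names and the example sizes are hypothetical, not taken from the paper.

```python
def attention_pairs(num_nodes):
    """Node pairs scored by dense all-to-all attention."""
    return num_nodes * num_nodes

def virtual_edges(num_nodes, k):
    """Edges added when every node links to exactly k virtual nodes."""
    return num_nodes * k

n, k = 10_000, 2                      # illustrative sizes only
dense = attention_pairs(n)            # grows quadratically in n
sparse = virtual_edges(n, k)          # grows linearly in n for fixed k
print(dense, sparse)
```

For a fixed `k`, doubling the number of nodes doubles the added edges but quadruples the attention pairs.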
