Benchmarking Positional Encodings for GNNs and Graph Transformers
Florian Grötschla, Jiaqing Xie, Roger Wattenhofer
TL;DR
This work addresses how to evaluate Positional Encodings (PEs) for Graph Neural Networks and Graph Transformers independently of architectural innovations. It proposes a unified benchmarking framework and conducts a large-scale study across 8 architectures, 9 PEs, and 10 datasets, totaling over 500 configurations. The key finding is that higher theoretical expressiveness, as captured by WL-based metrics, does not consistently translate into improved downstream performance; some expressive encodings can even degrade results on real-world tasks. The results highlight that PE effectiveness is task-dependent, reveal simple configurations that rival state-of-the-art methods, and demonstrate that sparse attention can match full attention when paired with suitable PEs. The authors also provide an open-source benchmark to facilitate reproducible future evaluation of PEs in graph learning.
Abstract
Positional Encodings (PEs) are essential for injecting structural information into Graph Neural Networks (GNNs), particularly Graph Transformers, yet their empirical impact remains insufficiently understood. We introduce a unified benchmarking framework that decouples PEs from architectural choices, enabling a fair comparison across 8 GNN and Transformer models, 9 PEs, and 10 synthetic and real-world datasets. Across more than 500 model-PE-dataset configurations, we find that commonly used expressiveness proxies, including Weisfeiler-Lehman distinguishability, do not reliably predict downstream performance. In particular, highly expressive PEs frequently fail to improve performance on real-world tasks and can even degrade it. At the same time, we identify several simple and previously overlooked model-PE combinations that match or outperform recent state-of-the-art methods. Our results demonstrate the strong task-dependence of PEs and underscore the need for empirical validation beyond theoretical expressiveness. To support reproducible research, we release an open-source benchmarking framework for evaluating PEs for graph learning tasks.
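
To make the notion of Weisfeiler-Lehman distinguishability concrete, the following is a minimal sketch and is not taken from the released benchmark. It runs 1-WL color refinement on two non-isomorphic graphs (a 6-cycle and two disjoint triangles) and then computes random-walk return probabilities as an example encoding. The function names `wl_histogram` and `rw_return_probs` are hypothetical, and the choice of a random-walk encoding is an assumption made for illustration; the abstract does not list which nine PEs were studied. The sketch only shows what "more expressive" means in this sense and says nothing about downstream accuracy, which is the gap the abstract describes.

```python
from collections import Counter


def wl_histogram(adj, rounds=3):
    """1-WL color refinement; returns the multiset of final node colors.

    Colors are kept as nested tuples so they are directly comparable
    across graphs without a shared relabeling table (fine for tiny graphs).
    """
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


def rw_return_probs(adj, steps=3):
    """Per node: probability that a uniform random walk is back at its
    start after k steps, for k = 1..steps (illustrative encoding)."""
    result = {}
    for start in adj:
        dist = {start: 1.0}
        probs = []
        for _ in range(steps):
            nxt = {}
            for v, p in dist.items():
                for u in adj[v]:
                    nxt[u] = nxt.get(u, 0.0) + p / len(adj[v])
            dist = nxt
            probs.append(round(dist.get(start, 0.0), 6))
        result[start] = tuple(probs)
    return result


# Two 2-regular graphs on six nodes that are not isomorphic.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

# 1-WL assigns identical color histograms to both graphs.
same_under_wl = wl_histogram(cycle6) == wl_histogram(two_triangles)

# The 3-step return probability differs (0 on the cycle, 0.25 on a triangle).
pe_cycle = Counter(rw_return_probs(cycle6).values())
pe_triangles = Counter(rw_return_probs(two_triangles).values())
separated_by_pe = pe_cycle != pe_triangles

print(same_under_wl, separated_by_pe)
```

Here the encoding separates a pair of graphs that 1-WL cannot, which is the kind of expressiveness gain a WL-based proxy measures. Whether such a gain helps on a given dataset is an empirical question.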
