Comparing Graph Transformers via Positional Encodings
Mitchell Black, Zhengchao Wan, Gal Mishne, Amir Nayyeri, Yusu Wang
TL;DR
This work develops a theoretical framework to compare graph transformers with absolute and relative positional encodings (APE-GTs and RPE-GTs). It proves that APEs and RPEs can be exchanged without loss of distinguishing power for unfeatured graphs, while revealing that RPEs may outperform APEs when node features are present. The authors introduce RPE-augWL and connect RPEs to WL variants and 2-EGNs, enabling rigorous comparisons across encodings and guiding future PE design. They also provide extensive case studies of SPE, resistance distance, and other common encodings, showing practical implications for graph learning tasks and highlighting when converting between encodings is beneficial or detrimental. Overall, the results offer principled guidance for selecting and designing positional encodings in graph transformers, with implications for both theory and practice.
Abstract
The distinguishing power of graph transformers is closely tied to the choice of positional encoding: features used to augment the base transformer with information about the graph. There are two primary types of positional encoding: absolute positional encodings (APEs) and relative positional encodings (RPEs). APEs assign features to each node and are given as input to the transformer. RPEs instead assign a feature to each pair of nodes, e.g., graph distance, and are used to augment the attention block. A priori, it is unclear which method is better for maximizing the power of the resulting graph transformer. In this paper, we aim to understand the relationship between these two types of positional encodings. Interestingly, we show that graph transformers using APEs and RPEs are equivalent in terms of distinguishing power. In particular, we demonstrate how to interchange APEs and RPEs while preserving the distinguishing power of the resulting graph transformers. Based on our theoretical results, we study several APEs and RPEs (including the resistance distance and the recently introduced stable and expressive positional encoding (SPE)) and compare the distinguishing power of the transformers that use them. We believe our work will help navigate the large number of choices of positional encoding and will provide guidance on the future design of positional encodings for graph transformers.
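To make the APE/RPE distinction concrete, the following is a minimal NumPy sketch rather than the construction used in the paper. It computes one illustrative APE (Laplacian eigenvectors, a common choice that is assumed here and not taken from the abstract) and two RPEs (shortest-path distance and resistance distance), then feeds them to a toy single-head attention layer. All function names are hypothetical, and subtracting the RPE from the attention logits is just one assumed way of "augmenting the attention block".

```python
import numpy as np

def laplacian_eigenvector_ape(adj, k):
    """APE sketch: one k-dimensional feature per node, taken from the
    eigenvectors of the graph Laplacian with the smallest nonzero
    eigenvalues (assumes a connected graph)."""
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)      # eigenvalues in ascending order
    return vecs[:, 1:k + 1]            # skip the constant eigenvector

def shortest_path_distance(adj):
    """RPE sketch: one scalar per node pair, via Floyd-Warshall."""
    n = adj.shape[0]
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

def resistance_distance(adj):
    """RPE sketch: resistance distance from the Laplacian pseudoinverse
    (assumes a connected graph)."""
    lap = np.diag(adj.sum(axis=1)) - adj
    pinv = np.linalg.pinv(lap)
    d = np.diag(pinv)
    return d[:, None] + d[None, :] - 2.0 * pinv

def attention_with_pes(x, ape, rpe):
    """Toy single-head attention without learned weights.
    The APE is concatenated to the node features (input side);
    the RPE biases the attention logits (assumed: subtracted, so that
    nearby nodes receive more weight)."""
    h = np.concatenate([x, ape], axis=1)
    logits = h @ h.T / np.sqrt(h.shape[1]) - rpe
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ h

# Usage on a path graph with 4 nodes and a constant node feature.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
ape = laplacian_eigenvector_ape(adj, k=2)   # shape (4, 2): per node
spd = shortest_path_distance(adj)           # shape (4, 4): per node pair
rd = resistance_distance(adj)               # shape (4, 4): per node pair
out = attention_with_pes(np.ones((4, 1)), ape, spd)
```

The shapes show the difference: the APE has one row per node and enters through the input features, while each RPE has one entry per node pair and enters through the attention weights. On trees such as this path graph, resistance distance and shortest-path distance coincide, so the two RPEs only differ on graphs with cycles.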
