Towards Graph Foundation Models: A Study on the Generalization of Positional and Structural Encodings
Billy Joe Franks, Moshe Eliasof, Semih Cantürk, Guy Wolf, Carola-Bibiane Schönlieb, Sophie Fellenz, Marius Kloft
TL;DR
The paper investigates whether learnable graph positional and structural encodings (PSEs), exemplified by GPSE, can serve as universal building blocks for graph foundation models. It shows that GPSE can function as a universal node encoder under mild assumptions, but that downstream universality is not guaranteed without randomness or task-specific pretraining. GPSE can also accelerate convergence and improve data efficiency in many settings, though the gains are dataset-dependent. In extensive experiments on synthetic expressivity benchmarks and real molecular datasets (e.g., ZINC-12k, MolNet), GPSE and its variants often outperform baselines in generalization and in data-scarce regimes, but they do not surpass every baseline on every task. The findings suggest that PSEs hold significant potential as integral components of future graph foundation models, while underscoring the need for better generalization mechanisms across diverse graph domains.
Abstract
Recent advances in integrating positional and structural encodings (PSEs) into graph neural networks (GNNs) have significantly enhanced their performance across various graph learning tasks. However, the general applicability of these encodings and their potential to serve as foundational representations for graphs remain uncertain. This paper investigates the fine-tuning efficiency, scalability with sample size, and generalization capability of learnable PSEs across diverse graph datasets. Specifically, we evaluate their potential as universal pre-trained models that can be easily adapted to new tasks with minimal fine-tuning and limited data. Furthermore, we assess the expressivity of the learned representations, particularly when used to augment downstream GNNs. We demonstrate through extensive benchmarking and empirical analysis that PSEs generally enhance downstream models, although some datasets may require specific PSE augmentations to achieve optimal performance. Nevertheless, our findings highlight their significant potential to become integral components of future graph foundation models. We provide new insights into the strengths and limitations of PSEs, contributing to the broader discourse on foundation models in graph learning.
