Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models
Ben Finkelshtein, İsmail İlkan Ceylan, Michael Bronstein, Ron Levie
TL;DR
This work tackles node-level prediction across diverse graphs by enforcing a triple-symmetry inductive bias: node permutation equivariance, feature permutation invariance, and label permutation equivariance. It develops TSNet as a universal, symmetry-preserving building block and derives a practical TS-GNN recipe that upgrades standard GNNs into graph foundation models capable of zero-shot transfer to unseen graphs with unseen feature and label spaces. The authors characterize the space of triple-symmetric linear layers, prove that the resulting network is a universal approximator on multisets under these symmetries, and validate the approach on 29 node-classification benchmarks, reporting strong zero-shot performance that improves consistently as the number of training graphs grows. The results suggest that building these symmetries into the architecture is what enables transfer across graphs whose feature and label spaces differ.
Abstract
Graph machine learning architectures are typically tailored to specific tasks on specific datasets, which hinders their broader applicability. This has led to a new quest in graph machine learning: how to build graph foundation models capable of generalizing across arbitrary graphs and features? In this work, we present a recipe for designing graph foundation models for node-level tasks from first principles. The key ingredient underpinning our study is a systematic investigation of the symmetries that a graph foundation model must respect. In a nutshell, we argue that label permutation-equivariance alongside feature permutation-invariance are necessary in addition to the common node permutation-equivariance on each local neighborhood of the graph. To this end, we first characterize the space of linear transformations that are equivariant to permutations of nodes and labels, and invariant to permutations of features. We then prove that the resulting network is a universal approximator on multisets that respect the aforementioned symmetries. Our recipe uses such layers on the multiset of features induced by the local neighborhood of the graph to obtain a class of graph foundation models for node property prediction. We validate our approach through extensive experiments on 29 real-world node classification datasets, demonstrating both strong zero-shot empirical performance and consistent improvement as the number of training graphs increases.
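To make the three symmetries concrete, the following is a minimal numerical sketch; it is not the paper's TSNet layer. It assumes node features X of shape (n, d) and a label matrix Y of shape (n, c), and it reads the combined symmetry as f(PXQ, PYR) = P f(X, Y) R for permutation matrices P (nodes), Q (features), and R (labels). The function `toy_triple_symmetric_map`, its weights, and this formalization are illustrative assumptions, and the map ignores graph structure entirely.

```python
import numpy as np

def toy_triple_symmetric_map(X, Y, w=(1.0, 0.5, 0.25, 0.125)):
    """Hypothetical toy map R^{n x d} x R^{n x c} -> R^{n x c} (not TSNet).

    Each term is equivariant to node and label permutations; X enters only
    through the Gram matrix X @ X.T, which is invariant to feature permutations.
    """
    n = Y.shape[0]
    a, b, g, d = w
    gram = X @ X.T                                   # (n, n), unchanged by column permutations of X
    return (a * Y                                    # identity term
            + b * (gram @ Y) / n                     # feature-dependent mixing over nodes
            + g * Y.mean(axis=0, keepdims=True)      # pool over nodes, broadcast back
            + d * Y.mean(axis=1, keepdims=True))     # pool over labels, broadcast back

def random_permutation_matrix(k, rng):
    # Rows of the identity matrix in a random order.
    return np.eye(k)[rng.permutation(k)]

rng = np.random.default_rng(0)
n, d, c = 6, 4, 3
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, c))
P = random_permutation_matrix(n, rng)   # node permutation
Q = random_permutation_matrix(d, rng)   # feature permutation
R = random_permutation_matrix(c, rng)   # label permutation

# Permuting the inputs should equal permuting the output.
lhs = toy_triple_symmetric_map(P @ X @ Q, P @ Y @ R)
rhs = P @ toy_triple_symmetric_map(X, Y) @ R
symmetric = np.allclose(lhs, rhs)
print(symmetric)
```

Because no weight in this sketch is tied to a particular feature column or label column, the same function applies unchanged to inputs with any d and c. That property is what makes zero-shot application to a new feature and label space possible in principle.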
