Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models

Ben Finkelshtein, İsmail İlkan Ceylan, Michael Bronstein, Ron Levie

TL;DR

This work tackles node-level prediction across diverse graphs by enforcing a triple-symmetry inductive bias: node permutation equivariance, feature permutation invariance, and label permutation equivariance. It introduces TSNet, a universal symmetry-preserving building block, and derives from it TS-GNN, a practical recipe that turns standard GNNs into graph foundation models capable of zero-shot transfer to unseen graphs and to unseen feature and label spaces. The authors fully characterize the triple-symmetric linear layers, prove universal approximation theorems, and evaluate on 29 node-classification benchmarks, reporting strong zero-shot performance that improves as the number of training graphs grows. The results indicate that handling these symmetries explicitly supports transfer across domains and data regimes.

Abstract

Graph machine learning architectures are typically tailored to specific tasks on specific datasets, which hinders their broader applicability. This has led to a new quest in graph machine learning: how to build graph foundation models capable of generalizing across arbitrary graphs and features? In this work, we present a recipe for designing graph foundation models for node-level tasks from first principles. The key ingredient underpinning our study is a systematic investigation of the symmetries that a graph foundation model must respect. In a nutshell, we argue that label permutation-equivariance alongside feature permutation-invariance are necessary in addition to the common node permutation-equivariance on each local neighborhood of the graph. To this end, we first characterize the space of linear transformations that are equivariant to permutations of nodes and labels, and invariant to permutations of features. We then prove that the resulting network is a universal approximator on multisets that respect the aforementioned symmetries. Our recipe uses such layers on the multiset of features induced by the local neighborhood of the graph to obtain a class of graph foundation models for node property prediction. We validate our approach through extensive experiments on 29 real-world node classification datasets, demonstrating both strong zero-shot empirical performance and consistent improvement as the number of training graphs increases.

Paper Structure

This paper contains 44 sections, 22 theorems, 203 equations, 4 figures, 11 tables.

Key Result

Proposition 1

A linear function of the form $T=(T_1,T_2)\colon {\mathbb{R}}^{N \times F\times K_1}\times{\mathbb{R}}^{N \times C\times K_1} \to {\mathbb{R}}^{N \times F\times K_2}\times{\mathbb{R}}^{N \times C\times K_2}$ is $(S_N \times S_F \times S_C)$-equivariant if and only if there exist ${\bm{\Lambda}}^{(1)}$ … where for every $i\in[6]$, we have ${\bm{X}}^{(i)}_{:,:,k_2} = \sum_{k_1=0}^{K_1} {\bm{X}}_{:,:,k_1}$ …
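
For reference, the equivariance condition named in Proposition 1 can be written out explicitly. The block below is the standard definition of equivariance under a product of permutation groups, applied to the tensor shapes in the proposition; the action notation $\sigma\cdot$ and the index names $n, f, c, k$ are assumptions of this sketch and may differ from the paper's own notation.

```latex
% Action of \sigma = (\sigma_N, \sigma_F, \sigma_C) \in S_N \times S_F \times S_C
% on a feature tensor X \in R^{N \times F \times K} and a label tensor Y \in R^{N \times C \times K}:
(\sigma \cdot {\bm{X}})_{n,f,k} = {\bm{X}}_{\sigma_N^{-1}(n),\, \sigma_F^{-1}(f),\, k},
\qquad
(\sigma \cdot {\bm{Y}})_{n,c,k} = {\bm{Y}}_{\sigma_N^{-1}(n),\, \sigma_C^{-1}(c),\, k}.

% T = (T_1, T_2) is (S_N \times S_F \times S_C)-equivariant if, for all \sigma, X, Y:
T_1(\sigma \cdot {\bm{X}},\, \sigma \cdot {\bm{Y}}) = \sigma \cdot T_1({\bm{X}}, {\bm{Y}}),
\qquad
T_2(\sigma \cdot {\bm{X}},\, \sigma \cdot {\bm{Y}}) = \sigma \cdot T_2({\bm{X}}, {\bm{Y}}).
```

Note that node permutations act on both tensors, while feature permutations touch only ${\bm{X}}$ and label permutations touch only ${\bm{Y}}$; the channel axis ($K_1$ or $K_2$) is left fixed.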

Figures (4)

  • Figure 1: The input to a triple-symmetry network is a feature matrix ${\bm{X}}$ and (possibly masked) label matrix ${\bm{Y}}$. The encoder must be equivariant to element-wise permutations $\sigma_N\in S_N$ (affecting the rows of both ${\bm{X}}$ and ${\bm{Y}}$), equivariant to class label permutations $\sigma_C\in S_C$ (affecting the columns of ${\bm{Y}}$), and invariant to feature permutations $\sigma_F\in S_F$ (affecting the columns of ${\bm{X}}$).
  • Figure 2: An illustration of the feature and label embeddings across the layers of our triple-symmetric graph neural network architecture. The architecture is composed of feature-, label- and node-equivariant aggregation layers and a final feature-invariant projection layer.
  • Figure 3: Average zero-shot accuracy of TS-Mean and GraphAny across 20 datasets as a function of the number of training graphs.
  • Figure 4: Flow chart illustrating the main steps in the proofs of \ref{lem:TSNet_from(11)(1F_1)} and \ref{lem:universality_equivariance}, the key lemmas used in establishing \ref{thm:universality}. Each box represents a logical step, and arrows indicate the lemmas, arguments, and dependencies connecting them. Blue and green boxes denote steps that extend techniques similar to, or inspired by, Segol and Lipman (2019) and Maron et al. (2020). The proof steps of \ref{lem:TSNet_from(11)(1F_1)} and \ref{lem:universality_equivariance} are labeled (1--5) and (i--iv), respectively, with the final step (iii,iv) highlighted with a thicker outline.
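
The three symmetry requirements described in the Figure 1 and Figure 2 captions can be checked numerically on a toy example. The sketch below is not the paper's TSNet or TS-GNN: `toy_symmetric_layer` is a hypothetical hand-built map, assumed here only for illustration, that uses mean pooling over features and mean aggregation over neighbors so that its output is node-equivariant, label-equivariant, and feature-invariant by construction. The adjacency matrix `A` and the helper `check_symmetries` are likewise assumptions of this sketch.

```python
import numpy as np


def toy_symmetric_layer(A, X, Y):
    """Toy map (A, X, Y) -> (N, C) scores. Hypothetical; NOT the paper's TSNet.

    A: (N, N) adjacency matrix, X: (N, F) features, Y: (N, C) labels.
    """
    # Node degrees, clipped at 1 so isolated nodes do not divide by zero.
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    # Mean over the feature axis: unchanged by any reordering of X's columns.
    s = X.mean(axis=1, keepdims=True)            # (N, 1)
    # Mean aggregation over neighbors: commutes with node relabeling.
    neigh_Y = (A @ Y) / deg                      # (N, C)
    neigh_s = (A @ s) / deg                      # (N, 1)
    # Every label column is treated identically, so permuting the columns
    # of Y permutes the output columns in the same way.
    return Y * s + neigh_Y + neigh_s


def check_symmetries(A, X, Y, rng):
    """Return booleans for (node equiv., feature inv., label equiv.)."""
    out = toy_symmetric_layer(A, X, Y)
    p = rng.permutation(X.shape[0])              # node permutation
    f = rng.permutation(X.shape[1])              # feature permutation
    c = rng.permutation(Y.shape[1])              # label permutation
    node_ok = np.allclose(
        toy_symmetric_layer(A[np.ix_(p, p)], X[p], Y[p]), out[p]
    )
    feat_ok = np.allclose(toy_symmetric_layer(A, X[:, f], Y), out)
    label_ok = np.allclose(toy_symmetric_layer(A, X, Y[:, c]), out[:, c])
    return node_ok, feat_ok, label_ok


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, F, C = 6, 4, 3
    upper = np.triu((rng.random((N, N)) < 0.5).astype(float), k=1)
    A = upper + upper.T                          # symmetric, no self-loops
    X = rng.normal(size=(N, F))
    Y = rng.normal(size=(N, C))
    print(check_symmetries(A, X, Y, rng))
```

Because the map never indexes a specific feature or label column, the same function applies unchanged to graphs with different values of `F` and `C`, which is the property the triple-symmetry design relies on for transfer to unseen feature and label spaces.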

Theorems & Definitions (62)

  • Proposition 1
  • Lemma 1
  • Theorem 4.1
  • Definition 1: Group Homomorphism
  • Definition 2: Representations
  • Definition 3: Group Action
  • Definition 4: Orbit
  • Definition 5: $G$-invariant Subspace
  • Definition 6: Subrepresentation
  • Definition 7: Irreducible Representation
  • ...and 52 more