GraphPFN: A Prior-Data Fitted Graph Foundation Model
Dmitry Eremeev, Oleg Platonov, Gleb Bazhenov, Artem Babenko, Liudmila Prokhorenkova
TL;DR
GraphPFN addresses doubts about the feasibility of graph foundation models by pretraining a PFN-based graph model on a carefully designed synthetic graph prior and augmenting a tabular backbone with graph adapters. It introduces a two-stage pretraining regime over $2{,}240{,}000$ synthetic graphs, which takes roughly $7$ days on eight NVIDIA A100 GPUs, and uses a joint objective that combines PFN supervision with masked graph modeling. The model achieves strong in-context learning performance on GraphLand and competitive performance on classic graph datasets, and finetuning often yields state-of-the-art results. These results support the viability of PFN-based graph foundation models. The work separates graph-aware pretraining from task-specific finetuning and suggests that synthetic priors can capture realistic graph structure and attributes.
Abstract
Graph foundation models face several fundamental challenges, including transferability across datasets and data scarcity, which call into question the very feasibility of such models. However, despite similar challenges, the tabular domain has recently witnessed the emergence of the first successful foundation models, such as TabPFNv2 and LimiX. Many of these models are based on the prior-data fitted networks (PFN) framework, in which models are pretrained on carefully designed synthetic datasets to make predictions in an in-context learning setting. Recently, G2T-FM took the first step towards adopting PFNs for graphs, yet it is limited to hand-crafted features and was never pretrained on graph data. In this work, we take the next step by proposing GraphPFN, a PFN-based model designed and pretrained specifically for graph node-level tasks. Following the PFN framework, we first design a prior distribution of synthetic attributed graphs: structure is generated by a novel combination of multi-level stochastic block models and a preferential attachment process, and attributes are generated by graph-aware structured causal models. Then, we augment the tabular foundation model LimiX with attention-based graph neighborhood aggregation layers and train it on synthetic graphs sampled from our prior. On diverse real-world graph datasets with node-level tasks, GraphPFN shows strong in-context learning performance and achieves state-of-the-art results after finetuning, outperforming both G2T-FM and task-specific GNNs trained from scratch on most datasets. More broadly, GraphPFN shows the potential of PFN-based models for building graph foundation models.
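To make the structure-generation idea in the abstract more concrete, the following is a minimal sketch of sampling one synthetic graph by combining a stochastic block model with a preferential attachment step. It is an illustration under simplifying assumptions, not the authors' prior: it uses a single-level block model rather than a multi-level one, omits attribute generation entirely, and the function name `sample_sbm_pa_graph` and all of its parameters (`p_in`, `p_out`, `n_pa_edges`, and so on) are hypothetical.

```python
import numpy as np


def sample_sbm_pa_graph(n_nodes=60, n_blocks=3, p_in=0.15, p_out=0.01,
                        n_pa_edges=40, seed=0):
    """Sample an undirected graph: SBM edges plus degree-biased extra edges.

    Hypothetical illustration only; all names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)

    # Assign each node to one of n_blocks communities uniformly at random.
    blocks = rng.integers(0, n_blocks, size=n_nodes)

    # Single-level SBM: edge probability is p_in within a block, p_out across.
    same_block = blocks[:, None] == blocks[None, :]
    probs = np.where(same_block, p_in, p_out)

    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n_nodes, n_nodes)) < probs, k=1)
    adj = upper | upper.T

    # Preferential attachment step: pick a random source node, then pick the
    # target with probability proportional to (current degree + 1).
    for _ in range(n_pa_edges):
        u = rng.integers(n_nodes)
        weights = adj.sum(axis=1) + 1.0
        weights[u] = 0.0  # forbid self-loops
        v = rng.choice(n_nodes, p=weights / weights.sum())
        adj[u, v] = adj[v, u] = True

    return adj, blocks
```

Calling `sample_sbm_pa_graph(seed=1)` returns a boolean adjacency matrix and the block assignment of each node. The block model supplies community structure, while the degree-biased step skews the degree distribution towards a few high-degree nodes; a PFN-style prior would draw the hyperparameters themselves at random for every sampled graph.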
