Graph Foundation Models: A Comprehensive Survey
Zehong Wang, Zheyuan Liu, Tianyi Ma, Jiazheng Li, Zheyuan Zhang, Xingbo Fu, Yiyang Li, Zhengqing Yuan, Wei Song, Yijun Ma, Qingkai Zeng, Xiusi Chen, Jianan Zhao, Jundong Li, Meng Jiang, Pietro Lio, Nitesh Chawla, Chuxu Zhang, Yanfang Ye
TL;DR
Graph Foundation Models (GFMs) extend foundation-model principles to non-Euclidean, relational graph data through a pretrain-then-adapt framework that unifies backbone architectures, pretraining strategies, and adaptation mechanisms. The survey introduces a three-way taxonomy by generalization scope (universal, task-specific, and domain-specific GFMs) and a modular framework that accommodates graph neural networks, language models, and their hybrids, with the aim of enabling cross-domain transfer and open-ended graph reasoning. It analyzes theoretical foundations (transferability and emergent capabilities), benchmarks, and key challenges such as structural alignment, heterogeneity, scalability, and evaluation, and it outlines future directions toward scalable, multimodal, and theory-grounded GFMs. A public repository with datasets, baselines, and pretrained models accompanies the survey to support reproducibility. Overall, the survey positions GFMs as foundational infrastructure for scalable, general-purpose reasoning over structured data, with implications for domains such as molecules, knowledge graphs, and biology.
Abstract
Graph-structured data pervades domains such as social networks, biological systems, knowledge graphs, and recommender systems. While foundation models have transformed natural language processing, vision, and multimodal learning through large-scale pretraining and generalization, extending these capabilities to graphs -- characterized by non-Euclidean structures and complex relational semantics -- poses unique challenges and opens new opportunities. To this end, Graph Foundation Models (GFMs) aim to bring scalable, general-purpose intelligence to structured data, enabling broad transfer across graph-centric tasks and domains. This survey provides a comprehensive overview of GFMs, unifying diverse efforts under a modular framework comprising three key components: backbone architectures, pretraining strategies, and adaptation mechanisms. We categorize GFMs by their generalization scope -- universal, task-specific, and domain-specific -- and review representative methods, key innovations, and theoretical insights within each category. Beyond methodology, we examine theoretical foundations including transferability and emergent capabilities, and highlight key challenges such as structural alignment, heterogeneity, scalability, and evaluation. Positioned at the intersection of graph learning and general-purpose AI, GFMs are poised to become foundational infrastructure for open-ended reasoning over structured data. This survey consolidates current progress and outlines future directions to guide research in this rapidly evolving field. Resources are available at https://github.com/Zehong-Wang/Awesome-Foundation-Models-on-Graphs.
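As a rough illustration of the three-component decomposition described above (backbone, pretraining, adaptation), the following minimal sketch wires toy stand-ins for each stage into a pretrain-then-adapt pipeline. It is not taken from the survey: every name here (`make_backbone`, `pretrain`, `adapt`, the mixing weight `alpha`, the example graph) is hypothetical, the "backbone" is a single mean-aggregation step, the "pretraining" is a grid search over one parameter with a label-free link-based objective, and the "adaptation" is a nearest-centroid classifier on frozen embeddings. Real GFMs replace each stage with far richer components.

```python
from typing import Callable, Dict, List, Sequence

# All names below are hypothetical and chosen for illustration only.
Graph = Dict[int, List[int]]        # adjacency list: node id -> neighbor ids
Features = Dict[int, List[float]]   # node id -> feature vector
Encoder = Callable[[Graph, Features], Features]


def make_backbone(alpha: float) -> Encoder:
    """Toy backbone: one message-passing step mixing self and mean-neighbor features."""
    def encode(graph: Graph, feats: Features) -> Features:
        out: Features = {}
        for node, nbrs in graph.items():
            dim = len(feats[node])
            # Isolated nodes fall back to their own features.
            pool = [feats[n] for n in nbrs] or [feats[node]]
            nbr_mean = [sum(v[i] for v in pool) / len(pool) for i in range(dim)]
            out[node] = [alpha * feats[node][i] + (1 - alpha) * nbr_mean[i]
                         for i in range(dim)]
        return out
    return encode


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pretrain(graph: Graph, feats: Features,
             candidates: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0)) -> Encoder:
    """Toy self-supervised pretraining: pick alpha so that linked nodes embed
    closer together than non-linked nodes. No labels are used."""
    def loss(alpha: float) -> float:
        emb = make_backbone(alpha)(graph, feats)
        nodes = sorted(graph)
        pos, neg = [], []
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                (pos if v in graph[u] else neg).append(sq_dist(emb[u], emb[v]))
        mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
        return mean(pos) - mean(neg)
    return make_backbone(min(candidates, key=loss))


def adapt(encoder: Encoder, graph: Graph, feats: Features,
          labels: Dict[int, str]) -> Callable[[int], str]:
    """Toy adaptation: keep the encoder frozen and fit a nearest-centroid
    classifier from a handful of labeled nodes."""
    emb = encoder(graph, feats)
    groups: Dict[str, List[List[float]]] = {}
    for node, y in labels.items():
        groups.setdefault(y, []).append(emb[node])
    centroids = {y: [sum(col) / len(vs) for col in zip(*vs)]
                 for y, vs in groups.items()}

    def predict(node: int) -> str:
        return min(centroids, key=lambda y: sq_dist(emb[node], centroids[y]))
    return predict


# Hypothetical example graph: two triangles joined by the edge 2-3.
graph: Graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
feats: Features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.8, 0.2],
                   3: [0.2, 0.8], 4: [0.1, 0.9], 5: [0.0, 1.0]}

encoder = pretrain(graph, feats)                      # stage 2 selects the stage-1 backbone
predict = adapt(encoder, graph, feats, {0: "a", 5: "b"})  # stage 3 uses two labels
print(predict(1), predict(4))
```

The point of the sketch is the separation of concerns: the pretraining stage never sees labels, and the adaptation stage never modifies the encoder, so either stage could in principle be swapped out independently.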
