GraphFM: A Comprehensive Benchmark for Graph Foundation Model

Yuhao Xu; Xinqi Liu; Keyu Duan; Yi Fang; Yu-Neng Chuang; Daochen Zha; Qiaoyu Tan

TL;DR

GraphFM tackles the challenge that Graph Foundation Models trained with Graph Self-Supervised Learning often generalize poorly across tasks, scale poorly to large graphs, and rely on simplistic stopping criteria. It introduces a unified, open benchmark that evaluates eight GSSL methods on six datasets under both full-batch and mini-batch training, across node classification, link prediction, and node clustering, with consistent processing and hyperparameter search. The results reveal nuanced strengths: generative and contrastive approaches differ in cross-task performance and scalability, and early stopping criteria can strongly influence generalization, especially on large data. By enabling fair comparison and providing actionable insights, GraphFM aims to guide future research toward better homogenization, scalable training, and robust stopping strategies for Graph Foundation Models.

Abstract

Foundation Models (FMs) serve as a general class for the development of artificial intelligence systems, offering broad potential for generalization across a spectrum of downstream tasks. Despite extensive research into self-supervised learning as the cornerstone of FMs, four outstanding issues persist in Graph Foundation Models that rely on graph self-supervised learning: 1) Homogenization. The extent of generalization capability on downstream tasks remains unclear. 2) Scalability. It is unknown how effectively these models can scale to large datasets. 3) Efficiency. The training time and memory usage of these models require evaluation. 4) Training Stop Criteria. It remains unclear which stopping strategy for pre-training maximizes performance across multiple downstream tasks. To address these questions, we have constructed a rigorous benchmark that thoroughly analyzes the generalization and scalability of self-supervised Graph Neural Network (GNN) models. Regarding generalization, we have implemented various self-supervised GNN models, trained to generate node representations, and compared their performance across node classification, link prediction, and node clustering. For scalability, we have compared the performance of these models when trained with full-batch and mini-batch strategies. Additionally, we have assessed the training efficiency of these models by measuring their GPU memory usage and throughput. Through these experiments, we aim to provide insights to motivate future research. The code for this benchmark is publicly available at https://github.com/NYUSHCS/GraphFM.
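
The stopping-criterion question raised in the abstract can be illustrated with a small sketch. The abstract does not say which criteria the benchmark compares, so this assumes a generic patience rule; the names PatienceStopper and select_checkpoint are hypothetical, and the score sequences are made-up numbers for illustration, not results from the paper. The sketch shows how the same rule picks different checkpoints depending on whether it monitors the pre-training objective or a downstream validation metric.

```python
from typing import Optional, Sequence


class PatienceStopper:
    """Hypothetical helper: stop once the monitored score (higher is better)
    has failed to improve for `patience` consecutive checks."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_score: Optional[float] = None
        self.best_epoch: Optional[int] = None
        self.bad_checks = 0

    def update(self, epoch: int, score: float) -> bool:
        """Record one score; return True if training should stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


def select_checkpoint(scores: Sequence[float], patience: int = 3) -> Optional[int]:
    """Return the epoch whose checkpoint the patience rule would keep."""
    stopper = PatienceStopper(patience=patience)
    for epoch, score in enumerate(scores):
        if stopper.update(epoch, score):
            break
    return stopper.best_epoch


# Illustrative numbers only (assumed, not taken from the benchmark).
# Criterion A: negated pre-training loss, which keeps improving.
neg_pretrain_loss = [-1.0, -0.8, -0.7, -0.65, -0.6, -0.58, -0.57, -0.56]
# Criterion B: a downstream validation accuracy that peaks early.
val_accuracy = [0.60, 0.70, 0.74, 0.73, 0.72, 0.71, 0.70, 0.69]

epoch_by_loss = select_checkpoint(neg_pretrain_loss)
epoch_by_val = select_checkpoint(val_accuracy)
print(epoch_by_loss, epoch_by_val)
```

With these assumed sequences, monitoring the pre-training loss keeps the final checkpoint, while monitoring validation accuracy keeps a much earlier one. A gap of this kind is what makes the choice of stopping criterion matter when one pre-trained encoder serves several downstream tasks.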

Paper Structure (31 sections, 10 equations, 4 figures, 18 tables)

Introduction
Preliminaries
Benchmark Design
Dataset and Implementations
Research Questions
Experiments Results and Analyses
Performance Comparison in Node Classification (RQ1)
Performance Comparison in Link Prediction and Node Clustering (RQ2)
Performance and Efficiency Comparison in Large-Scale Dataset (RQ3)
Performance Using Alternative Early Stopping Criterion (RQ4)
Future Directions
Conclusion
Additional Details on Benchmark
Datasets
GSSL models
...and 16 more sections

Figures (4)

Figure 1: An overview of GraphFM. We perform a comprehensive benchmark of state-of-the-art self-supervised GNN models through four key aspects: dataset scale, training strategies, GSSL methods for Graph FMs, and adaptability to different downstream tasks.
Figure 2: Link prediction results on Cora, Citeseer, and Pubmed with full-batch training.
Figure 3: Time and space consumption of different methods and training strategies on Pubmed.
Figure 4: Node clustering results on Cora, Citeseer, and Pubmed with full-batch training.
