A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning across Broad Atlases and Disorders
Xinxu Wei, Kanhao Zhao, Yong Jiao, Lifang He, Yu Zhang
TL;DR
BrainGFM is a graph-based brain foundation model pre-trained on a large, multi-atlas fMRI corpus to address data heterogeneity and limited transfer across parcellations. It combines graph contrastive learning and graph masked autoencoders within a Graph Transformer backbone, augmented by atlas/parcellation prompts and task/disorder prompts, with meta-learning for few-shot transfer and language prompts for zero-shot transfer. Across 25 disorders and 8 parcellations, BrainGFM demonstrates better generalization and efficiency than foundation models built on time series or on connectome (functional connectivity) features, while enabling flexible downstream adaptation through graph and language prompts. This work offers a scalable, generalizable framework for multi-atlas neuroimaging analysis with broad translational potential in clinical neuroscience and brain-computer interfacing.
Abstract
As large language models (LLMs) continue to revolutionize AI research, there is growing interest in building large-scale brain foundation models to advance neuroscience. While most existing brain foundation models are pre-trained on time-series signals or connectome features, we propose a novel graph-based pre-training paradigm for constructing a brain graph foundation model. In this paper, we introduce the Brain Graph Foundation Model, termed BrainGFM, a unified framework that leverages graph contrastive learning and graph masked autoencoders for large-scale fMRI-based pre-training. BrainGFM is pre-trained on a diverse mixture of brain atlases with varying parcellations, significantly expanding the pre-training corpus and enhancing the model's ability to generalize across heterogeneous fMRI-derived brain representations. To support efficient and versatile downstream transfer, we integrate both graph prompts and language prompts into the model design, enabling BrainGFM to flexibly adapt to a wide range of atlases, neurological and psychiatric disorders, and task settings. Furthermore, we employ meta-learning to optimize the graph prompts, facilitating strong generalization to previously unseen disorders under few-shot conditions and, via language-guided prompting, under zero-shot conditions. BrainGFM is pre-trained on 27 neuroimaging datasets spanning 25 common neurological and psychiatric disorders, encompassing 2 types of brain atlases (functional and anatomical) across 8 widely used parcellations, and covering over 25,000 subjects, 60,000 fMRI scans, and a total of 400,000 graph samples aggregated across all atlases and parcellations.
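To make the pairing of the two pre-training objectives concrete, the following is a minimal sketch of how a graph contrastive loss and a masked-reconstruction loss might be summed into one objective. It is an illustration only, not the BrainGFM implementation: the abstract does not specify the loss forms or their weighting, so the InfoNCE formulation, the mean-squared-error reconstruction term, and the names `info_nce`, `masked_mse`, `joint_pretraining_loss`, `temperature`, and `recon_weight` are all assumptions. The encoder and decoder are omitted; the functions take embeddings and reconstructions as plain lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, temperature=0.5):
    """Contrastive term (assumed InfoNCE form).

    view_a[i] and view_b[i] are embeddings of two augmented views of
    brain graph i; every other graph in view_b acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the matching view.
        total += log_denominator - logits[i]
    return total / len(view_a)


def masked_mse(original, reconstructed, masked_nodes):
    """Reconstruction term (assumed MSE), scored on masked nodes only."""
    total, count = 0.0, 0
    for node in masked_nodes:
        for target, guess in zip(original[node], reconstructed[node]):
            total += (target - guess) ** 2
            count += 1
    return total / count


def joint_pretraining_loss(view_a, view_b, original, reconstructed,
                           masked_nodes, recon_weight=1.0):
    """Weighted sum of both terms; recon_weight is a hypothetical knob."""
    contrastive = info_nce(view_a, view_b)
    reconstruction = masked_mse(original, reconstructed, masked_nodes)
    return contrastive + recon_weight * reconstruction
```

In this sketch the contrastive term operates on graph-level embeddings while the reconstruction term operates on node features (for a brain graph, one node per region of interest), so the two signals are complementary: one separates graphs from each other, the other forces the model to recover hidden regional information.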
