Diffusion-Guided Pretraining for Brain Graph Foundation Models

Xinxu Wei, Rong Zhou, Lifang He, Yu Zhang

TL;DR

This work proposes a unified diffusion-based pretraining framework for brain graphs and hypergraphs. Diffusion replaces the random dropping and masking used by existing contrastive and masked autoencoder methods with structure-aware strategies, and it enables topology-aware graph-level readout and node-level global reconstruction.

Abstract

With the growing interest in foundation models for brain signals, graph-based pretraining has emerged as a promising paradigm for learning transferable representations from connectome data. However, existing contrastive and masked autoencoder methods typically rely on naive random dropping or masking for augmentation, which is ill-suited for brain graphs and hypergraphs as it disrupts semantically meaningful connectivity patterns. Moreover, commonly used graph-level readout and reconstruction schemes fail to capture global structural information, limiting the robustness of learned representations. In this work, we propose a unified diffusion-based pretraining framework that addresses both limitations. First, diffusion is designed to guide structure-aware dropping and masking strategies, preserving brain graph semantics while maintaining effective pretraining diversity. Second, diffusion enables topology-aware graph-level readout and node-level global reconstruction by allowing graph embeddings and masked nodes to aggregate information from globally related regions. Extensive experiments across multiple neuroimaging datasets with over 25,000 subjects and 60,000 scans involving various mental disorders and brain atlases demonstrate consistent performance improvements.
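To make the idea of diffusion-guided dropping concrete, here is a minimal sketch in plain Python. It is not the authors' procedure: the abstract does not specify the diffusion kernel or the dropping rule, so the truncated personalized-PageRank kernel, the function names (`ppr_diffusion`, `diffusion_guided_edge_drop`), and the parameters `alpha`, `steps`, and `drop_ratio` are all assumptions chosen for illustration. The sketch drops weakly diffused edges more often than strongly diffused ones, which is one plausible way to perturb a graph while sparing its dominant connectivity patterns.

```python
import random


def row_normalize(adj):
    """Convert a weighted adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([w / total if total > 0 else 0.0 for w in row])
    return out


def matmul(a, b):
    """Dense matrix product for small list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(r[k] * b[k][j] for k in range(inner)) for j in range(cols)] for r in a]


def ppr_diffusion(adj, alpha=0.15, steps=20):
    """Truncated personalized-PageRank diffusion (assumed kernel):
    S = alpha * sum_{k=0..steps} (1 - alpha)^k * T^k, with T the transition matrix."""
    n = len(adj)
    trans = row_normalize(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T^0
    diff = [[0.0] * n for _ in range(n)]
    coeff = alpha
    for _ in range(steps + 1):
        for i in range(n):
            for j in range(n):
                diff[i][j] += coeff * power[i][j]
        power = matmul(power, trans)
        coeff *= 1.0 - alpha
    return diff


def diffusion_guided_edge_drop(adj, drop_ratio=0.3, alpha=0.15, seed=0):
    """Drop undirected edges with a probability that shrinks as diffusion strength grows.
    Hypothetical rule: the strongest edge is never dropped, the weakest is dropped
    with probability min(1, 2 * drop_ratio)."""
    rng = random.Random(seed)
    n = len(adj)
    out = [row[:] for row in adj]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j] > 0]
    if not edges:
        return out
    diff = ppr_diffusion(adj, alpha)
    # Symmetrize the diffusion score because the graph is undirected.
    score = {(i, j): 0.5 * (diff[i][j] + diff[j][i]) for i, j in edges}
    low, high = min(score.values()), max(score.values())
    span = (high - low) or 1.0
    for i, j in edges:
        strength = (score[(i, j)] - low) / span  # 0 = weakest edge, 1 = strongest
        p_drop = min(1.0, 2.0 * drop_ratio * (1.0 - strength))
        if rng.random() < p_drop:
            out[i][j] = out[j][i] = 0.0
    return out


# Toy 4-region connectivity matrix with weights in [0, 1].
toy_adj = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.5, 0.0],
    [0.1, 0.5, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]
augmented_view = diffusion_guided_edge_drop(toy_adj, drop_ratio=0.3, seed=1)
```

Calling the function with different seeds yields different augmented views of the same graph, which is the role augmentation plays in contrastive pretraining. A uniform random drop would instead give every edge the same removal probability regardless of its place in the topology.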

Paper Structure

This paper contains 34 sections, 34 equations, 5 figures, 13 tables, and 3 algorithms.

Figures (5)

  • Figure 1: The proposed diffusion-enhanced framework includes Contrastive Pretraining (left) and Masking Pretraining (right) for Graph/Hypergraph. In both pipelines, graph diffusion is applied for topology-aware augmentation, masking, diffusion-based readout and reconstruction.
  • Figure 2: Diffusion-based graph and hypergraph dropping and masking strategies for augmentation. The figure contrasts conventional random drop/mask strategies (✗) with the proposed diffusion-guided approaches (✓) for both graph and hypergraph pretraining. Stars indicate node importance, and edge numbers denote weights in $[0,1]$. In graph contrastive learning (GCL), nodes and edges are dropped to generate augmented views, while in graph masked autoencoders (GMAE), nodes and their features are masked and reconstructed. Random strategies in GCL/GMAE may cause over- or under-perturbation, harming semantics or contrastiveness (blank nodes and dashed edges), whereas diffusion adaptively balances perturbation strength. Moreover, unlike the local-only reconstruction in GMAE/HGMAE (edges without arrows), diffusion enables global information aggregation across long-range nodes (bidirectional arrows).
  • Figure 3: We evaluate graph- and hypergraph-based pretraining (denoted by G and H, respectively) with and without diffusion on homogeneous and heterogeneous atlases, where SC100/200/300 denote Schaefer atlases with 100/200/300 ROIs. Diffusion-based pretraining consistently improves performance across atlas settings on ABIDE.
  • Figure 4: We compare diffusion-embedded architectures (GDT), diffusion-based pretraining (BrainGFM-Diff, Brain-HyperGFM), and their combination (GDT-Diff) on two datasets. Results highlight distinct behaviors of architectural diffusion and diffusion-based pretraining across data scales.
  • Figure 5: Each heatmap shows diffusion strengths between node pairs, where brighter colors indicate stronger connections. The first column depicts original intra-community connectivity, the second column applies diffusion within each community, and the third column applies diffusion across communities, illustrating both intra- and inter-community propagation effects.
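
The captions above describe diffusion-based readout as letting a graph embedding aggregate information from globally related regions. The following sketch shows one way such a readout could work, given a precomputed diffusion matrix like the heatmaps in Figure 5. It is an illustrative assumption, not the paper's formulation: the function name `diffusion_readout` and the choice to weight nodes by the diffusion mass they receive are hypothetical.

```python
def diffusion_readout(diff, feats):
    """Pool node features into one graph embedding using a diffusion matrix.

    diff  : n x n matrix, diff[i][j] = diffusion strength from node i to node j
    feats : n x d matrix of node embeddings
    Assumed scheme: (1) smooth each node's features over globally related nodes,
    (2) weight nodes by the total diffusion mass they receive, (3) take the weighted sum.
    """
    n, d = len(feats), len(feats[0])
    # Step 1: each node aggregates features from every node it diffuses to.
    smoothed = [
        [sum(diff[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
        for i in range(n)
    ]
    # Step 2: column sums act as a simple importance score per node.
    importance = [sum(diff[i][j] for i in range(n)) for j in range(n)]
    total = sum(importance)
    weights = [w / total for w in importance]
    # Step 3: importance-weighted pooling over the smoothed features.
    return [sum(weights[i] * smoothed[i][c] for i in range(n)) for c in range(d)]


# With an identity diffusion matrix there is no propagation, so the readout
# reduces to ordinary mean pooling.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
node_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean_like = diffusion_readout(identity, node_feats)

# A non-trivial row-stochastic diffusion matrix shifts weight toward node 0,
# which receives the most diffusion mass.
hub_diff = [[0.6, 0.2, 0.2], [0.5, 0.4, 0.1], [0.5, 0.1, 0.4]]
hub_embedding = diffusion_readout(hub_diff, node_feats)
```

The identity case makes the contrast with a conventional readout explicit: mean pooling treats every region equally, while a diffusion-weighted readout lets strongly connected regions contribute more to the graph-level embedding.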