Region-Aware Reconstruction Strategy for Pre-training fMRI Foundation Model
Ruthwik Reddy Doodipala, Pankaj Pandey, Carolina Torres Rojas, Manob Jyoti Saikia, Ranganatha Sitaram
TL;DR
This work introduces ROI-guided masking for pretraining a functional MRI (fMRI) foundation model, integrating anatomical regions via the AAL3 atlas to preserve voxel-level information during reconstruction. Built on the NeuroSTORM architecture, the approach pretrains on full 4D fMRI volumes with region-based masking and then fine-tunes for downstream tasks with the encoder held fixed. On ADHD-200, ROI masking yields a 4.23% improvement in healthy-control vs. ADHD classification accuracy over random masking, with cerebellar and limbic regions contributing most to reconstruction fidelity and to discriminative representations. The authors report that the method enhances interpretability and robustness, and they propose extending the evaluation to additional datasets and developing region-aware loss functions and adaptive masking strategies for broader applicability in functional neuroimaging.
Abstract
The emergence of foundation models in neuroimaging is driven by the increasing availability of large-scale, heterogeneous brain imaging datasets. Recent advances in self-supervised learning, particularly reconstruction-based objectives, have demonstrated strong potential for pretraining models that generalize effectively across diverse downstream functional MRI (fMRI) tasks. In this study, we explore region-aware reconstruction strategies for a foundation model in resting-state fMRI, moving beyond approaches that rely on random masking. Specifically, we introduce an ROI-guided masking strategy using the Automated Anatomical Labelling atlas 3 (AAL3), applied directly to full 4D fMRI volumes to selectively mask semantically coherent brain regions during self-supervised pretraining. Using the ADHD-200 dataset, comprising 973 subjects with resting-state fMRI scans, we show that our method achieves a 4.23% improvement in classification accuracy for distinguishing healthy controls from individuals diagnosed with ADHD, compared to conventional random masking. Region-level attribution analysis reveals that brain volumes within the limbic region and cerebellum contribute most to reconstruction fidelity and model representation. Our results demonstrate that masking anatomical regions during pretraining not only enhances interpretability but also yields more robust and discriminative representations. In future work, we plan to extend this approach by evaluating it on additional neuroimaging datasets and by developing new loss functions explicitly derived from region-aware reconstruction objectives. These directions aim to further improve the robustness and interpretability of foundation models for functional neuroimaging.
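To make the ROI-guided masking idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a 3D integer label volume (such as AAL3 resampled to the fMRI grid, with 0 as background) and a 4D array laid out as (X, Y, Z, T). The function names `apply_roi_masking` and `masked_mse`, the `mask_ratio` parameter, the zero fill value, masking a region across all time points, and scoring reconstruction only on masked voxels are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def apply_roi_masking(fmri, atlas, mask_ratio=0.5, rng=None, fill_value=0.0):
    """Mask whole atlas regions in a 4D fMRI volume of shape (X, Y, Z, T).

    Hypothetical helper: picks a random subset of atlas labels and blanks
    every voxel of those regions across all time points.
    """
    if fmri.ndim != 4 or fmri.shape[:3] != atlas.shape:
        raise ValueError("fmri must be (X, Y, Z, T) and atlas must be (X, Y, Z)")

    rng = np.random.default_rng(rng)

    # Region labels present in the atlas; 0 is assumed to be background.
    labels = np.unique(atlas)
    labels = labels[labels != 0]
    if labels.size == 0:
        raise ValueError("atlas contains no non-background labels")

    # Choose whole regions to hide (at least one).
    n_masked = max(1, int(round(mask_ratio * labels.size)))
    chosen = rng.choice(labels, size=n_masked, replace=False)

    # 3D boolean mask: True wherever a voxel belongs to a chosen region.
    voxel_mask = np.isin(atlas, chosen)

    masked = fmri.copy()
    masked[voxel_mask, :] = fill_value  # blank the region at every time point
    return masked, voxel_mask, chosen


def masked_mse(pred, target, voxel_mask):
    """Mean squared error restricted to masked voxels (an assumed objective)."""
    diff = pred[voxel_mask] - target[voxel_mask]
    return float(np.mean(diff ** 2))
```

In contrast to random masking, every hidden voxel here belongs to a complete anatomical region, so a model trained to reconstruct the input has to infer that region's signal from the rest of the brain.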
