Lost in Distortion: Uncovering the Domain Gap Between Computer Vision and Brain Imaging -- A Study on Pretraining for Age Prediction
Yanteng Zhang, Songheng Li, Zeyu Shen, Qizhen Lan, Lipei Zhang, Yang Liu, Vince Calhoun
TL;DR
The paper investigates whether noisy, heterogeneous brain MRI data can benefit self-supervised pretraining for downstream brain age prediction, highlighting a domain gap between computer vision and clinical neuroimaging. It adopts a two-stage pipeline with a 3D MAE for pretraining on unlabeled MRIs, followed by fine-tuning on ADNI for brain age, and examines how data quality influences performance. Findings indicate that simply aggregating large, diverse scans does not guarantee gains; data quality and distribution alignment are crucial, and multimodal (MRI+PET) pretraining can offer improvements. The work argues for domain-aware curation and dataset design to develop robust, trustworthy brain-imaging foundation models in clinical settings.
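The masked-autoencoder idea behind the pretraining stage can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names (`patchify_3d`, `random_mask`, `masked_mse`), the 32×32×32 toy volume, the patch size of 8, and the 75% mask ratio are all assumptions chosen for illustration. It shows only the core mechanics: split a volume into cubic patches, hide a random subset, and score reconstruction on the hidden patches alone.

```python
import numpy as np


def patchify_3d(volume, patch):
    """Split a (D, H, W) volume into non-overlapping cubic patches.

    Returns an array of shape (num_patches, patch**3).
    """
    d, h, w = volume.shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0
    gd, gh, gw = d // patch, h // patch, w // patch
    # Separate each axis into (grid index, within-patch offset).
    x = volume.reshape(gd, patch, gh, patch, gw, patch)
    # Bring the three grid indices to the front, offsets to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5)
    return x.reshape(gd * gh * gw, patch ** 3)


def random_mask(num_patches, mask_ratio, rng):
    """Boolean mask over patches; True marks a patch hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    mask = np.zeros(num_patches, dtype=bool)
    mask[perm[:num_masked]] = True
    return mask


def masked_mse(pred, target, mask):
    """Mean squared reconstruction error computed over masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return float(per_patch[mask].mean())


# Toy stand-in for a preprocessed MRI volume (assumed size, random values).
rng = np.random.default_rng(0)
volume = rng.standard_normal((32, 32, 32)).astype(np.float32)

patches = patchify_3d(volume, patch=8)      # 4*4*4 = 64 patches of 512 voxels
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)  # assumed ratio
visible = patches[~mask]                    # what an encoder would receive
```

In a real pipeline, an encoder would process `visible`, a decoder would predict every patch, and `masked_mse` would serve as the pretraining loss. Because the loss is defined on raw voxel content, distorted or incomplete scans change the reconstruction target directly, which is one way to see why scan quality could matter at this stage.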
Abstract
Large-scale brain imaging datasets provide unprecedented opportunities for developing domain foundation models through pretraining. However, unlike natural image datasets in computer vision, neuroimaging data are often highly heterogeneous in quality, ranging from well-structured scans to severely distorted or incomplete brain volumes. This raises a fundamental question: can noisy or low-quality scans contribute meaningfully to pretraining, or do they instead hinder model learning? In this study, we systematically explore the role of data quality in pretraining and its impact on downstream tasks. Specifically, we pretrain on datasets of different quality levels and fine-tune for brain age prediction on external cohorts. Our results show significant performance differences across quality levels, revealing both opportunities and limitations. We further discuss the gap between computer vision practices and clinical neuroimaging standards, emphasizing the need for domain-aware curation to ensure trustworthy and generalizable domain-specific foundation models.
