AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation
Asbjørn Munk, Jakob Ambsdorf, Sebastian Llambias, Mads Nielsen
TL;DR
The paper tackles the lack of large public unlabeled data for 3D brain MRI segmentation by introducing BRAINS-45K, the largest public brain MRI collection to date for pretraining. It proposes AMAES, a memory-efficient masked autoencoder framework with augmentation reversal for 3D segmentation, which uses a lightweight decoder and omits skip connections during pretraining. Models are pretrained on BRAINS-45K and finetuned on three downstream tasks. Results show that AMAES improves Dice scores across BraTS21, ISLES22, and WMH, and often surpasses SwinUNETR baselines, including in out-of-domain settings, while reducing memory and runtime. The work offers a practical, scalable route toward large-scale self-supervised 3D medical segmentation, and releases code and BRAINS-45K for reproducibility.
Abstract
This study investigates the impact of self-supervised pretraining of 3D semantic segmentation models on a large-scale, domain-specific dataset. We introduce BRAINS-45K, a dataset of 44,756 brain MRI volumes from public sources, the largest public dataset available. We revisit a number of design choices for pretraining modern segmentation architectures by simplifying and optimizing state-of-the-art methods and combining them with a novel augmentation strategy. The resulting AMAES framework is based on masked image modeling and intensity-based augmentation reversal, and balances memory usage, runtime, and finetuning performance. Using the popular U-Net and the recent MedNeXt architecture as backbones, we evaluate the effect of pretraining on three challenging downstream tasks, covering single-sequence, low-resource settings, and out-of-domain generalization. The results highlight that pretraining on the proposed dataset with AMAES significantly improves segmentation performance in the majority of evaluated cases, and that it is beneficial to pretrain the model with augmentations, despite pretraining on a large-scale dataset. Code and model checkpoints for reproducing results, as well as the BRAINS-45K dataset, are available at https://github.com/asbjrnmunk/amaes.
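
To make the pretraining objective described in the abstract concrete, the following is a minimal NumPy sketch of masked image modeling combined with intensity-based augmentation reversal on a 3D volume. It is an illustration under stated assumptions, not the authors' implementation: the function names, the patch size, the mask ratio, the specific augmentation (gamma change plus Gaussian noise), and the choice to compute the loss only on masked voxels are all hypothetical and not taken from the paper.

```python
import numpy as np

def mask_patches(volume, patch=4, mask_ratio=0.6, rng=None):
    """Zero out a random subset of non-overlapping cubic patches.

    Returns the masked volume and a boolean voxel mask (True = masked).
    Patch size and mask ratio are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    assert all(s % patch == 0 for s in volume.shape), "shape must divide by patch"
    grid = tuple(s // patch for s in volume.shape)
    n_patches = int(np.prod(grid))
    n_masked = int(round(mask_ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    patch_mask = flat.reshape(grid)
    # Expand the patch-level mask to voxel resolution.
    voxel_mask = (patch_mask.repeat(patch, axis=0)
                            .repeat(patch, axis=1)
                            .repeat(patch, axis=2))
    return np.where(voxel_mask, 0.0, volume), voxel_mask

def intensity_augment(volume, rng):
    """Hypothetical intensity augmentation: gamma change plus Gaussian noise."""
    gamma = rng.uniform(0.7, 1.5)
    return np.clip(volume, 0.0, 1.0) ** gamma + rng.normal(0.0, 0.05, volume.shape)

def reconstruction_loss(prediction, clean, voxel_mask):
    """Mean squared error against the clean (un-augmented) volume.

    Restricting the loss to masked voxels is an assumption of this sketch.
    """
    diff = (prediction - clean)[voxel_mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
clean = rng.random((16, 16, 16))            # stand-in for a normalized MRI volume
augmented = intensity_augment(clean, rng)   # the model sees the augmented volume...
model_input, voxel_mask = mask_patches(augmented, patch=4, mask_ratio=0.6, rng=rng)
# ...but the target is the clean volume, so the model must undo both the
# masking and the augmentation. An identity "model" is used as a placeholder.
loss = reconstruction_loss(model_input, clean, voxel_mask)
```

In an actual training loop, `model_input` would be passed through an encoder and a lightweight decoder, and the decoder output would replace `model_input` in the loss call.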
