How We Won BraTS-SSA 2025: Brain Tumor Segmentation in the Sub-Saharan African Population Using Segmentation-Aware Data Augmentation and Model Ensembling
Claudia Takyi Ankomah, Livingstone Eli Ayivor, Ireneaus Nyame, Leslie Wambo, Patrick Yeboah Bonsu, Aondona Moses Iorumbur, Raymond Confidence, Toufiq Musah
TL;DR
This study tackles brain tumor segmentation in Sub-Saharan Africa (SSA) using BraTS-Africa data, addressing data scarcity and distributional gaps by pairing segmentation-aware offline data augmentation with a diverse model ensemble. It evaluates three architectures (SegMamba, MedNeXt, and Residual-Encoder U-Net) and their ensembles under 5-fold cross-validation, scoring them with lesion-wise Dice (LSD) and normalized surface distance (NSD). MedNeXt achieves the best single-model performance at 1000 epochs (LSD ~0.865, NSD ~0.810), while the ensemble trained for 1000 epochs yields the most balanced subregion segmentation (best average NSD ~0.815 and high LSD ~0.867), demonstrating the complementary strengths of architectural diversity and targeted augmentation. The findings indicate that segmentation-aware augmentation and ensembling can substantially improve generalization on underrepresented datasets, with practical implications for deploying robust brain tumor segmentation in SSA clinical contexts. Overall, the work offers a scalable way to improve robustness and accuracy in data-constrained settings through coordinated augmentation and model fusion.
Abstract
Brain tumors, particularly gliomas, pose significant challenges due to their complex growth patterns, infiltrative nature, and the variability in brain structure across individuals, which makes accurate diagnosis and monitoring difficult. Deep learning models have been developed to accurately delineate these tumors. However, most of these models were trained on relatively homogeneous high-resource datasets, limiting their robustness when deployed in underserved regions. In this study, we performed segmentation-aware offline data augmentation on the BraTS-Africa dataset to increase the sample size and diversity of the data and thereby enhance generalization. We further constructed an ensemble of three distinct architectures, MedNeXt, SegMamba, and Residual-Encoder U-Net, to leverage their complementary strengths. Our best-performing model, MedNeXt, was trained for 1000 epochs and achieved the highest average lesion-wise Dice and normalized surface distance scores of 0.86 and 0.81, respectively. However, the ensemble model trained for 500 epochs produced the most balanced segmentation performance across the tumor subregions. This work demonstrates that a combination of advanced augmentation and model ensembling can improve segmentation accuracy and robustness on diverse and underrepresented datasets. Code available at: https://github.com/SPARK-Academy-2025/SPARK-2025/tree/main/SPARK2025_BraTs_MODELS/SPARK_NeuroAshanti
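To make the ensembling idea concrete, the following is a minimal sketch of how predictions from several segmentation models might be fused. It assumes the ensemble averages per-class probability maps and takes the arg-max per voxel; the abstract does not state the fusion rule, so this is an illustration and not the authors' implementation. The names `ensemble_average`, `argmax_labels`, and `dice`, and the toy data, are hypothetical. The `dice` function is the plain per-label Dice score, not the lesion-wise variant used in the challenge, which scores each connected lesion separately. Pure Python lists stand in for the 3-D arrays a real pipeline would use.

```python
from typing import List, Optional, Sequence

# probs[class][voxel]: per-class probabilities for a volume flattened to 1-D.
ProbMap = List[List[float]]


def ensemble_average(prob_maps: Sequence[ProbMap],
                     weights: Optional[Sequence[float]] = None) -> ProbMap:
    """Weighted mean of the probability maps produced by several models.

    Assumption: mean-of-probabilities fusion; the source does not say
    how the ensemble combines model outputs.
    """
    if weights is None:
        weights = [1.0] * len(prob_maps)
    total = sum(weights)
    n_classes, n_voxels = len(prob_maps[0]), len(prob_maps[0][0])
    fused = [[0.0] * n_voxels for _ in range(n_classes)]
    for probs, w in zip(prob_maps, weights):
        for c in range(n_classes):
            for v in range(n_voxels):
                fused[c][v] += w * probs[c][v] / total
    return fused


def argmax_labels(probs: ProbMap) -> List[int]:
    """Pick the most probable class for every voxel."""
    n_classes, n_voxels = len(probs), len(probs[0])
    return [max(range(n_classes), key=lambda c: probs[c][v])
            for v in range(n_voxels)]


def dice(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Plain Dice overlap for one label (not the lesion-wise variant)."""
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    return 1.0 if denom == 0 else 2.0 * intersection / denom


# Toy data: 2 classes (0 = background, 1 = tumor), 4 voxels, 3 models.
# Each hypothetical model mislabels a different voxel.
truth = [0, 1, 1, 1]
model_a = [[0.9, 0.4, 0.2, 0.6], [0.1, 0.6, 0.8, 0.4]]
model_b = [[0.8, 0.6, 0.3, 0.3], [0.2, 0.4, 0.7, 0.7]]
model_c = [[0.7, 0.3, 0.6, 0.4], [0.3, 0.7, 0.4, 0.6]]

fused_labels = argmax_labels(ensemble_average([model_a, model_b, model_c]))
for name, probs in [("a", model_a), ("b", model_b), ("c", model_c)]:
    print(f"model {name} Dice:", dice(argmax_labels(probs), truth, 1))
print("ensemble Dice:", dice(fused_labels, truth, 1))
```

In this toy case each model misses one tumor voxel and scores a Dice of 0.8 on its own, while the averaged prediction recovers all three tumor voxels. That is the kind of complementary behavior an ensemble of architecturally different models is meant to exploit, though real gains depend on how correlated the models' errors are.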
