How we won BraTS 2023 Adult Glioma challenge? Just faking it! Enhanced Synthetic Data Augmentation and Model Ensemble for brain tumour segmentation
André Ferreira, Naida Solak, Jianning Li, Philipp Dammann, Jens Kleesiek, Victor Alves, Jan Egger
TL;DR
This work tackles data scarcity in adult brain tumour segmentation for BraTS 2023 by introducing two non-traditional data-augmentation pipelines: registration-based synthesis to massively expand the number of training samples, and GliGAN-based synthetic tumours inserted into healthy brains. It combines CNN-based nnU-Net baselines with a transformer-based Swin UNETR and a network inspired by the BraTS 2021 winner, forming 9 models trained with 5-fold cross-validation and ensembled via probability-map averaging. The approach achieves competitive lesion-based metrics on the validation set (Dice ~0.90 for whole tumour, ~0.867 for tumour core (TC), ~0.851 for enhancing tumour (ET); HD95 ~14–18), illustrating that synthetic data and ensemble strategies can compensate for limited labelled medical data while handling the new BraTS 2023 evaluation nuances. Limitations include a GliGAN size constraint and the need for post-processing thresholds; future work focuses on larger generators and on applying synthetic data to other networks.
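The probability-map averaging mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ensemble_average` and `to_labels`, the array layout (classes, depth, height, width), and the final argmax step are assumptions chosen for illustration. The argmax assumes mutually exclusive classes; a region-based setup with overlapping regions would threshold each averaged map instead.

```python
import numpy as np

def ensemble_average(prob_maps):
    """Average per-model probability maps.

    prob_maps: list of arrays, each of shape (C, D, H, W), one per model
    (hypothetical layout chosen for this sketch).
    """
    stacked = np.stack(prob_maps, axis=0)  # shape (M, C, D, H, W)
    return stacked.mean(axis=0)            # shape (C, D, H, W)

def to_labels(mean_probs):
    """Pick the most probable class per voxel (assumes exclusive classes)."""
    return mean_probs.argmax(axis=0)       # shape (D, H, W)

# Toy example: 3 hypothetical models, 2 classes, a 1x1x2 volume.
m1 = np.array([[[[0.9, 0.4]]], [[[0.1, 0.6]]]])
m2 = np.array([[[[0.8, 0.3]]], [[[0.2, 0.7]]]])
m3 = np.array([[[[0.7, 0.6]]], [[[0.3, 0.4]]]])

mean_probs = ensemble_average([m1, m2, m3])
labels = to_labels(mean_probs)
```

In the second voxel the third model disagrees with the other two, but the averaged probabilities still favour class 1, which is the behaviour an averaging ensemble is meant to provide.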
Abstract
Deep learning is the state-of-the-art technology for segmenting brain tumours. However, it requires a lot of high-quality data, which is difficult to obtain, especially in the medical field. Our solution therefore addresses this problem by using unconventional mechanisms for data augmentation. Generative adversarial networks and registration are used to massively increase the number of samples available for training three different deep learning models for brain tumour segmentation, the first task of the BraTS 2023 challenge. The first model is the standard nnU-Net, the second is the Swin UNETR, and the third is the winning solution of the BraTS 2021 challenge. The entire pipeline is built on the nnU-Net implementation, except for the generation of the synthetic data. Combining convolutional networks and transformers allows each to compensate for the other's knowledge gaps. Using the new metric, our best solution achieves Dice scores of 0.9005, 0.8673, 0.8509 and HD95 values of 14.940, 14.467, 17.699 (whole tumour, tumour core and enhancing tumour, respectively) on the validation set.
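For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the plain voxel-wise Dice coefficient on binary masks. It does not reproduce the new BraTS 2023 metric used for the reported numbers, and the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration.

```python
import numpy as np

def dice_score(pred, target):
    """Voxel-wise Dice coefficient between two binary masks.

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    Returns 1.0 when both masks are empty (a convention assumed here).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * intersection / denom)

# Toy example: two predicted voxels, one of which overlaps the reference.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

Here one voxel overlaps, the prediction has two foreground voxels and the reference has one, so the score is 2 * 1 / (2 + 1).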
