Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models
Muhammad Usman Akbar, Måns Larsson, Anders Eklund
TL;DR
This study addresses privacy-driven data-sharing barriers in medical imaging by evaluating whether synthetic brain MRI data can effectively train segmentation models. It compares four 2D GANs (Progressive GAN, StyleGAN 1-3) and a diffusion model for generating 5-channel brain MRI data (four MR sequences plus tumor annotation), and assesses performance using U-Net and Swin transformer segmentation networks on BraTS 2020/2021. The findings show that segmentation models trained on synthetic data achieve Dice scores that are 80-90% of those achieved by models trained on real data. Diffusion models are more prone to memorization on small datasets, while StyleGANs offer competitive performance. Overall, synthetic data sharing is viable but requires careful handling of memorization and model choice. The work provides public access to the generated synthetic images and trained models, and highlights the need for evaluation metrics beyond FID/IS that accurately reflect clinical segmentation performance.
Abstract
Large annotated datasets are required for training deep learning models, but in medical imaging data sharing is often complicated by ethics, anonymization and data protection legislation. Generative AI models, such as generative adversarial networks (GANs) and diffusion models, can today produce very realistic synthetic images, and can potentially facilitate data sharing. However, before synthetic medical images can be shared, it must first be demonstrated that they can be used for training different networks with acceptable performance. Here, we therefore comprehensively evaluate four GANs (Progressive GAN, StyleGAN 1-3) and a diffusion model for the task of brain tumor segmentation (using two segmentation networks, U-Net and a Swin transformer). Our results show that segmentation networks trained on synthetic images reach Dice scores that are 80% to 90% of the Dice scores obtained when training with real images, but that memorization of the training images can be a problem for diffusion models if the original dataset is too small. Our conclusion is that sharing synthetic medical images is a viable alternative to sharing real images, but that further work is required. The trained generative models and the generated synthetic images are shared on the AIDA data hub.
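To make the headline comparison concrete, the following is a minimal sketch of the Dice coefficient for binary masks, together with the ratio between a synthetic-trained and a real-trained model. The function names, the empty-mask convention and the toy masks are assumptions made for illustration. The abstract does not specify how scores are aggregated over tumor classes or volumes, so this is not the authors' evaluation code.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention (an assumption): two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_dice(dice_synthetic: float, dice_real: float) -> float:
    """Dice of the synthetic-trained model as a fraction of the real-trained one."""
    if dice_real <= 0:
        raise ValueError("reference Dice must be positive")
    return dice_synthetic / dice_real


# Toy flattened masks (hypothetical values, not data from the paper).
target = [0, 1, 1, 1, 0, 0, 1, 0]
pred_real = [0, 1, 1, 1, 0, 0, 1, 0]   # stand-in for a real-trained model
pred_synth = [0, 1, 1, 0, 0, 1, 1, 0]  # stand-in for a synthetic-trained model

d_real = dice_score(pred_real, target)
d_synth = dice_score(pred_synth, target)
ratio = relative_dice(d_synth, d_real)
print(f"real: {d_real:.2f}, synthetic: {d_synth:.2f}, ratio: {ratio:.2f}")
```

A ratio between 0.8 and 0.9 corresponds to the range reported in the abstract. In practice the score would typically be computed per tumor class and averaged over test subjects.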
