Conditional Diffusion Models for Semantic 3D Brain MRI Synthesis
Zolnamar Dorjsembe, Hsing-Kuo Pao, Sodtavilan Odonchimed, Furen Xiao
TL;DR
The paper tackles data scarcity and privacy in brain MRI by introducing Med-DDPM, a conditional diffusion model that generates high-fidelity 3D brain MRIs guided by segmentation masks through channel-wise conditioning, i.e., $\tilde{x}_t = x_t \oplus c$, where $\oplus$ denotes channel-wise concatenation of the noisy image $x_t$ with the conditioning mask $c$. It demonstrates that a pixel-wise L1 loss better preserves detail than L2 in this setting and achieves competitive, if not superior, fidelity and diversity compared with GAN baselines. Used for data augmentation, Med-DDPM improves downstream tumor segmentation: synthetic images alone reach a Dice score of 0.6207, close to the 0.6531 obtained with real data, and combining real and synthetic images (1k real + 2k synthetic) raises it to 0.6675. The model also enables multimodal synthesis (T1, T1CE, T2, FLAIR) from masks. While memory demands are higher than those of GAN-based methods and some vascular and mass-effect cues require refinement, the method offers a principled path toward data-efficient, privacy-preserving medical image synthesis with practical augmentation and anonymization potential.
Abstract
Artificial intelligence (AI) in healthcare, especially in medical imaging, faces challenges due to data scarcity and privacy concerns. To address these, we introduce Med-DDPM, a diffusion model designed for 3D semantic brain MRI synthesis. The model tackles data scarcity and privacy issues by integrating semantic conditioning: a conditioning image is concatenated channel-wise to the model input, enabling control over image generation. Med-DDPM demonstrates superior stability and performance compared to existing 3D brain imaging synthesis methods. It generates diverse, anatomically coherent images with high visual fidelity. On the tumor segmentation task, Med-DDPM achieves a Dice score of 0.6207, close to the 0.6531 obtained with real images, and outperforms baseline models. Combined with real images, it further increases the Dice score to 0.6675, showing the potential of our proposed method for data augmentation. This model represents the first use of a diffusion model in 3D semantic brain MRI synthesis, producing high-quality images. Its semantic conditioning feature also shows potential for image anonymization in biomedical imaging, addressing data and privacy issues. We provide the code and model weights for Med-DDPM on our GitHub repository (https://github.com/mobaidoctor/med-ddpm/) to support reproducibility.
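The channel-wise conditioning, the L1 versus L2 objective, and the Dice metric mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is in the linked repository). The function names, the toy volume size, the one-hot mask encoding, and the number of label classes are all assumptions made for illustration only.

```python
import numpy as np

def concat_condition(x_t, mask):
    """Channel-wise concatenation of a noisy volume and a conditioning mask.

    x_t:  array of shape (C, D, H, W), the noisy image at step t.
    mask: array of shape (K, D, H, W), the conditioning mask channels.
    Returns an array of shape (C + K, D, H, W).
    """
    if x_t.shape[1:] != mask.shape[1:]:
        raise ValueError("x_t and mask must share spatial dimensions")
    return np.concatenate([x_t, mask], axis=0)

def l1_loss(pred, target):
    """Mean absolute error between predicted and target arrays."""
    return float(np.mean(np.abs(pred - target)))

def l2_loss(pred, target):
    """Mean squared error between predicted and target arrays."""
    return float(np.mean((pred - target) ** 2))

def dice_score(pred, target, eps=1e-8):
    """Dice overlap between two binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))

# Toy example (hypothetical sizes): one image channel, four mask classes.
rng = np.random.default_rng(0)
x_t = rng.standard_normal((1, 8, 8, 8))
labels = rng.integers(0, 4, size=(8, 8, 8))
mask = np.moveaxis(np.eye(4)[labels], -1, 0)  # one-hot, shape (4, 8, 8, 8)

x_tilde = concat_condition(x_t, mask)
print(x_tilde.shape)  # (5, 8, 8, 8)
```

In a setup like this, the denoising network would take `x_tilde` as input at every step, so the mask steers generation without any change to the diffusion process itself.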
