3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes
Aghiles Kebaili, Jérôme Lapuyade-Lahorgue, Pierre Vera, Su Ruan
TL;DR
This work tackles data scarcity in MRI tumor segmentation by introducing a slice-based latent diffusion model (SBLDM) that jointly synthesizes 3D MRI volumes and their corresponding masks. Each 2D slice pair $(x_i, m_i)$ is encoded into a latent space together with a positional cue $l(i)$, and diffusion is performed in that latent space, which keeps generation data-efficient. A conditioning vector $c$, processed by an MLP $\tau$, provides explicit control over tumor size, shape, and position. Evaluations on BRATS2022 show that SBLDM delivers high image quality (PSNR/SSIM) and substantially improves segmentation performance when used for augmentation, outperforming GAN-based methods and standard 3D diffusion approaches, especially when sampling is accelerated with DDIM. The approach offers a practical route to robust, privacy-preserving data augmentation for medical imaging in data-scarce clinical settings, with potential for extension to multi-modal MRI synthesis.
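To make the conditioning idea concrete, the following is a minimal sketch of how a positional cue $l(i)$ and a conditioning vector $c$ might be built from a binary tumor mask and passed through a small MLP standing in for $\tau$. Everything here is an assumption for illustration: the function names, the normalised-index form of $l(i)$, the seven-component layout of $c$ (relative size, centroid, bounding-box extents), and the randomly initialised MLP weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def slice_position(i: int, num_slices: int) -> float:
    """Hypothetical positional cue l(i): slice index normalised to [0, 1]."""
    if num_slices < 2:
        return 0.0
    return i / (num_slices - 1)


def tumor_condition(mask: np.ndarray) -> np.ndarray:
    """Hypothetical conditioning vector c from a binary 3D mask of shape (D, H, W).

    Layout (an assumption, not the paper's definition):
    [relative size, centroid (z, y, x), bounding-box extents (z, y, x)],
    each expressed as a fraction of the volume dimensions.
    """
    mask = mask.astype(bool)
    if not mask.any():
        return np.zeros(7, dtype=np.float32)
    coords = np.argwhere(mask)                      # (n_voxels, 3)
    dims = np.array(mask.shape, dtype=np.float64)
    size = coords.shape[0] / mask.size              # tumor volume fraction
    centroid = coords.mean(axis=0) / dims           # relative position
    extent = (coords.max(axis=0) - coords.min(axis=0) + 1) / dims  # crude shape cue
    return np.concatenate([[size], centroid, extent]).astype(np.float32)


def mlp_tau(c, w1, b1, w2, b2):
    """Two-layer ReLU MLP standing in for tau; weights are placeholders."""
    hidden = np.maximum(c @ w1 + b1, 0.0)
    return hidden @ w2 + b2


# Toy volume: 8 slices of 16x16 with a small box-shaped "tumor".
rng = np.random.default_rng(0)
mask = np.zeros((8, 16, 16), dtype=np.uint8)
mask[2:4, 4:8, 4:8] = 1

c = tumor_condition(mask)
w1, b1 = rng.normal(size=(7, 32)), np.zeros(32)
w2, b2 = rng.normal(size=(32, 16)), np.zeros(16)
embedding = mlp_tau(c, w1, b1, w2, b2)              # conditioning embedding
positions = [slice_position(i, mask.shape[0]) for i in range(mask.shape[0])]
```

In a real model the embedding and the per-slice position would be fed to the denoising network alongside the latent of each slice pair; how that is done in SBLDM is not specified in this summary.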
Abstract
Despite the increasing use of deep learning in medical image segmentation, the limited availability of annotated training data remains a major challenge because data acquisition is time-consuming and privacy regulations restrict sharing. Segmentation tasks require both medical images and their corresponding target masks, yet conventional data augmentation approaches focus mainly on image synthesis. In this study, we propose a novel slice-based latent diffusion architecture that generates volumetric data slice by slice. The approach extends joint distribution modeling to medical images and their associated masks, allowing both to be generated simultaneously in data-scarce regimes. It also reduces the computational and memory costs typically associated with diffusion models. Furthermore, the architecture can be conditioned on tumor characteristics, including size, shape, and relative position, thereby producing a diverse range of tumor variations. Experiments on a segmentation task using the BRATS2022 dataset confirm the effectiveness of the synthesized volumes and masks for data augmentation.
