Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation
Yuran Wang, Zhijing Wan, Yansheng Qiu, Zheng Wang
TL;DR
This work tackles the scarcity of high-quality unlabeled data in SSL for medical imaging by presenting Locality-Aware Diffusion (Lad), a three-phase framework that generates high-fidelity 3D abdominal CT volumes. Lad constructs a locality-focused latent space using VQ-GAN, fits a diffusion model in that space guided by anatomical priors extracted from predicted organ masks, and samples with augmented locality conditions to produce large, diverse synthetic datasets. Key contributions include a Locality Loss that emphasizes fine-grained abdominal structures, a dual content/structure Condition Extractor based on Betti-topology features, and locality-conditioned sampling with classifier-free guidance. Experimental results on AbdomenCT-1K and TotalSegmentator show that Lad achieves state-of-the-art realism and diversity and substantially improves self-supervised organ segmentation performance, especially for small organs such as the pancreas and spleen. These results highlight the practical impact of synthetic data for SSL in medical imaging.
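The classifier-free guidance step mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which the guided noise prediction is the unconditional prediction plus a scaled difference between the conditional and unconditional predictions. The function name `cfg_combine`, the array shapes, and the guidance scale are hypothetical and not taken from the paper; the summary above does not specify how Lad combines the two predictions.

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and conditional noise predictions.

    Assumed convention: guidance_scale = 0 gives the unconditional
    prediction, 1 gives the conditional one, and values above 1
    extrapolate further toward the condition.
    """
    eps_uncond = np.asarray(eps_uncond, dtype=np.float64)
    eps_cond = np.asarray(eps_cond, dtype=np.float64)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy stand-ins for a denoiser's two outputs on a small latent volume
# (hypothetical 2x2x2 shape chosen only for illustration).
eps_uncond = np.zeros((2, 2, 2))
eps_cond = np.ones((2, 2, 2))
guided = cfg_combine(eps_uncond, eps_cond, guidance_scale=3.0)
```

In a real sampler, the two predictions would come from the same denoising network evaluated once with the locality condition and once with the condition dropped.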
Abstract
In medical image analysis, self-supervised learning (SSL) techniques have emerged to alleviate labeling demands, yet they still face a scarcity of training data owing to escalating resource requirements and privacy constraints. Numerous efforts employ generative models to produce high-fidelity, unlabeled 3D volumes across diverse modalities and anatomical regions. However, the intricate and hard-to-distinguish anatomical structures within the abdomen pose a unique challenge for abdominal CT volume generation compared with other anatomical regions. To address this overlooked challenge, we introduce Locality-Aware Diffusion (Lad), a novel method tailored for high-fidelity 3D abdominal CT volume generation. We design a locality loss to refine crucial anatomical regions and devise a condition extractor to integrate abdominal priors into generation, thereby enabling the generation of large quantities of high-quality abdominal CT volumes essential for SSL tasks without the need for additional data such as labels or radiology reports. Volumes generated by our method reproduce abdominal structures with remarkable fidelity, reducing the FID score from 0.0034 to 0.0002 on the AbdomenCT-1K dataset, closely mirroring authentic data and surpassing current methods. Extensive experiments demonstrate the effectiveness of our method in self-supervised organ segmentation tasks, where it improves mean Dice scores on two abdominal datasets. These results underscore the potential of synthetic data to advance self-supervised learning in medical image analysis.
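For readers unfamiliar with the segmentation metric reported above, the following is a minimal sketch of a mean Dice computation over integer label maps. It assumes the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), averaged over organ labels. The function names, the toy label maps, and the choice to score an organ absent from both maps as 1.0 are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def dice_score(pred_mask, target_mask):
    """Dice overlap between two binary masks of the same shape."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    target_mask = np.asarray(target_mask, dtype=bool)
    intersection = np.logical_and(pred_mask, target_mask).sum()
    denom = pred_mask.sum() + target_mask.sum()
    if denom == 0:
        # Assumption: an organ absent from both masks counts as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)

def mean_dice(pred_labels, target_labels, organ_ids):
    """Average the per-organ Dice over the given integer label ids."""
    scores = [dice_score(pred_labels == k, target_labels == k) for k in organ_ids]
    return float(np.mean(scores))

# Toy 2D label maps (0 = background, 1 and 2 = two hypothetical organs).
pred = np.array([[1, 1], [2, 0]])
target = np.array([[1, 0], [2, 2]])
score = mean_dice(pred, target, organ_ids=[1, 2])
```

The same functions apply unchanged to 3D label volumes, since the masks are compared elementwise.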
