Towards Generalizable Tumor Synthesis
Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou
TL;DR
DiffTumor tackles cross-organ tumor synthesis by leveraging the observation that early-stage tumors ($<2$ cm) share similar CT imaging characteristics across organs. It introduces a three-stage framework: (1) a 3D autoencoder trained on 9,262 unlabeled CT volumes to learn latent representations, (2) a latent-space diffusion model conditioned on a tumor mask $m$ and the healthy latent $z_{0,\text{healthy}}$, and (3) a segmentation model trained on synthetic tumors to augment real-data training. The approach uses large healthy-organ CT datasets to synthesize diverse tumors and evaluates generalization across organs and demographics, reporting improvements in Dice similarity coefficient (DSC) and sensitivity for real tumors across hospitals and backbones (e.g., +10.7% DSC across organs, +6.9% DSC and +16.4% sensitivity across demographics), with real-time synthesis at $T=4$. The work includes a Visual Turing Test indicating that the synthetic tumors approach the realism of real ones, and ablation results showing benefits in reduced annotation requirements, accelerated synthesis, and enhanced early-tumor detection. Overall, DiffTumor offers a practical, data-efficient path to robust, cross-domain tumor segmentation via synthetic augmentation, potentially reducing annotation costs and improving clinical AI deployment.
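For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how DSC and sensitivity can be computed from binary segmentation masks. It assumes a voxel-level definition of sensitivity, $TP / (TP + FN)$; the paper may instead report sensitivity per tumor or per patient, so this is an illustration and not the authors' evaluation code. The function names `dice_coefficient` and `voxel_sensitivity` and the toy volume are hypothetical.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def voxel_sensitivity(pred, target):
    """Voxel-level sensitivity TP / (TP + FN); NaN if the target has no positives."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    positives = target.sum()
    if positives == 0:
        return float("nan")
    return float(np.logical_and(pred, target).sum() / positives)


# Toy 3D example: the "real" tumor is a 2x2x2 cube (8 voxels), and the
# prediction covers it plus one extra slab (12 voxels in total).
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:4] = True

dsc = dice_coefficient(pred, target)    # 2 * 8 / (12 + 8) = 0.8
sens = voxel_sensitivity(pred, target)  # 8 / 8 = 1.0
```

In this toy case the prediction over-segments, so sensitivity is perfect while DSC is penalized for the extra voxels, which is why the two metrics are usually reported together.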
Abstract
Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation. However, success in tumor synthesis hinges on two requirements: the synthetic tumors must be visually realistic and generalizable across multiple organs, and the resulting AI models must be capable of detecting real tumors in images sourced from different domains (e.g., hospitals). This paper makes a progressive stride toward generalizable tumor synthesis by leveraging a critical observation: early-stage tumors (< 2 cm) tend to have similar imaging characteristics in computed tomography (CT), whether they originate in the liver, pancreas, or kidneys. We have ascertained that generative AI models, e.g., Diffusion Models, can create realistic tumors that generalize to a range of organs even when trained on a limited number of tumor examples from only one organ. Moreover, we have shown that AI models trained on these synthetic tumors generalize to detecting and segmenting real tumors in CT volumes, encompassing a broad spectrum of patient demographics, imaging protocols, and healthcare facilities.
