ContourDiff: Unpaired Medical Image Translation with Structural Consistency
Yuwen Chen, Nicholas Konz, Hanxue Gu, Haoyu Dong, Yaqian Chen, Lin Li, Jisoo Lee, Maciej A. Mazurowski
TL;DR
ContourDiff introduces a contour-guided diffusion framework for unpaired medical image translation that enforces anatomical fidelity by conditioning the denoiser on input-domain contours and adjacent-slice context. Its Spatially Coherent Guided Diffusion (SCGD) component promotes slice-to-slice spatial consistency, yielding CT-to-MRI translations that preserve realistic anatomy and improve downstream segmentation. The method also shows zero-shot capability, translating unseen anatomical regions and even different MRI contrasts without retraining, with strong quantitative and qualitative results across lumbar, hip-and-thigh, and liver datasets. The work offers practical gains for cross-modality training of segmentation models and for medical image harmonization, supported by ablations, robustness analyses, and efficiency metrics.
Abstract
Accurately translating medical images between different modalities, such as Computed Tomography (CT) to Magnetic Resonance Imaging (MRI), has numerous downstream clinical and machine learning applications. While several methods have been proposed to achieve this, they often prioritize perceptual quality with respect to output domain features over preserving anatomical fidelity. However, maintaining anatomy during translation is essential for many tasks, e.g., when leveraging masks from the input domain to develop a segmentation model with images translated to the output domain. To address this challenge, we propose ContourDiff with Spatially Coherent Guided Diffusion (SCGD), a novel framework that leverages domain-invariant anatomical contour representations of images. These representations are simple to extract from images, yet form precise spatial constraints on their anatomical content. We introduce a diffusion model that converts contour representations of images from arbitrary input domains into images in the output domain of interest. By applying the contour as a constraint at every diffusion sampling step, we ensure the preservation of anatomical content. We evaluate our method on challenging lumbar spine and hip-and-thigh CT-to-MRI translation tasks, via (1) the performance of segmentation models trained on translated images applied to real MRIs, and (2) the foreground FID and KID of translated images with respect to real MRIs. Our method outperforms other unpaired image translation methods by a significant margin across almost all metrics and scenarios. Moreover, it achieves this without access to any input-domain information during training. We further verify its zero-shot capability, showing that a model trained on one anatomical region can be directly applied to unseen regions without retraining (GitHub: https://github.com/mazurowski-lab/ContourDiff).
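To make the idea of a contour acting as a constraint at every sampling step more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a simple gradient-magnitude threshold as the contour extractor, channel-wise concatenation of the contour with the noisy image as the conditioning mechanism, and a standard DDPM ancestral sampler with a linear noise schedule. The names `extract_contour`, `sample_with_contour`, and `placeholder_denoiser` are hypothetical, and the adjacent-slice guidance of SCGD is not modeled.

```python
import numpy as np

def extract_contour(image, threshold=0.2):
    """Binary contour map from the normalized gradient magnitude.

    Assumption: a stand-in for whatever edge detector is actually used.
    """
    img = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(img)          # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak == 0:                      # constant image: no contours
        return np.zeros(img.shape, dtype=np.float32)
    return (mag / peak > threshold).astype(np.float32)

def placeholder_denoiser(model_input, t):
    """Hypothetical stand-in for a trained noise-prediction network.

    Receives a (2, H, W) array (noisy image, contour) and returns an
    (H, W) noise estimate. Here it simply predicts zero noise.
    """
    return np.zeros(model_input.shape[1:])

def sample_with_contour(contour, denoiser, steps=50, seed=0):
    """DDPM ancestral sampling with the contour re-attached at every step."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)   # linear schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(contour.shape)   # start from pure noise
    for t in reversed(range(steps)):
        # The contour is concatenated as an extra channel at each step,
        # so the denoiser always sees the same spatial constraint.
        model_input = np.stack([x, contour])
        eps = denoiser(model_input, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    ct_slice = np.zeros((16, 16))
    ct_slice[4:12, 4:12] = 1.0               # toy "anatomy": a bright square
    contour = extract_contour(ct_slice)
    translated = sample_with_contour(contour, placeholder_denoiser, steps=10)
    print(contour.sum(), translated.shape)
```

Because the contour is computed from the image alone, the same extractor can in principle be applied to either modality, which is what would allow a denoiser trained only on output-domain images and their contours to be driven by input-domain contours at sampling time.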
