Plasticine: A Traceable Diffusion Model for Medical Image Translation
Tianyang Zhang, Xinxing Cheng, Jun Cheng, Shaoming Zheng, He Zhao, Huazhu Fu, Alejandro F Frangi, Jiang Liu, Jinming Duan
TL;DR
Plasticine tackles the lack of traceability in medical image translation by embedding intensity translation and spatial transformation inside a diffusion framework, enabling pixel-level correspondences between source and translated images. It introduces a diffusion-based intensity translator and a cross-modality spatial transformation module that yields diffeomorphic, topology-preserving deformations and interpretable spatial changes. Across retinal OCT, chest MRI-CT, and cardiac MRI tasks, Plasticine demonstrates superior traceability, as measured by segmentation metrics, and competitive image synthesis against GAN and diffusion baselines, with clinical user studies supporting its practical utility. The work also notes limitations, including dependence on precomputed structure maps, and outlines plans to extend the method to 3D data.
Abstract
Domain gaps arising from variations in imaging devices and population distributions pose significant challenges for machine learning in medical image analysis. Existing image-to-image translation methods primarily aim to learn mappings between domains, often generating diverse synthetic data with variations in anatomical scale and shape, but they usually overlook spatial correspondence during the translation process. For clinical applications, traceability, defined as the ability to provide pixel-level correspondences between original and translated images, is equally important. This property enhances clinical interpretability but has been largely overlooked in previous approaches. To address this gap, we propose Plasticine, which is, to the best of our knowledge, the first end-to-end image-to-image translation framework explicitly designed with traceability as a core objective. Our method combines intensity translation and spatial transformation within a denoising diffusion framework. This design enables the generation of synthetic images with interpretable intensity transitions and spatially coherent deformations, supporting pixel-wise traceability throughout the translation process.
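To make the notion of pixel-wise traceability concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a dense displacement field is already available and uses a fixed gain and bias as a hypothetical stand-in for the learned diffusion-based intensity translator. The function names `warp_nearest` and `translate`, the nearest-neighbour sampling, and the order of the two steps are all assumptions made for illustration.

```python
import numpy as np

def warp_nearest(image, displacement):
    """Backward-warp `image` with a dense displacement field.

    displacement[y, x] = (dy, dx) means output pixel (y, x) takes its value
    from source pixel (y + dy, x + dx), rounded and clipped to the image.
    Returns the warped image and the integer source coordinates, which act
    as the pixel-level correspondence map.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + displacement[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + displacement[..., 1]).astype(int), 0, w - 1)
    warped = image[src_y, src_x]
    return warped, np.stack([src_y, src_x], axis=-1)

def translate(image, displacement, gain=0.5, bias=0.1):
    """Toy translation: spatial warp followed by a pointwise intensity map.

    `gain` and `bias` are hypothetical placeholders for a learned
    intensity translator; they are not taken from the paper.
    """
    warped, correspondence = warp_nearest(image, displacement)
    return gain * warped + bias, correspondence

# Usage: shift sampling one pixel to the right everywhere.
image = np.arange(16, dtype=float).reshape(4, 4)
displacement = np.zeros((4, 4, 2))
displacement[..., 1] = 1.0
translated, correspondence = translate(image, displacement)

# Every translated pixel can be traced back to a source pixel.
y, x = 0, 0
source_y, source_x = correspondence[y, x]
```

In this sketch, `correspondence[y, x]` records which source pixel produced each translated pixel, which is the kind of pixel-level link that the abstract calls traceability. A diffeomorphic deformation, as described in the TL;DR, would additionally guarantee that the mapping is smooth and invertible, which this toy warp does not enforce.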
