Structure-Aware Stylized Image Synthesis for Robust Medical Image Segmentation
Jie Bao, Zhixin Zhou, Wen Jung Li, Rui Luo
TL;DR
This work tackles domain shift in medical image segmentation by combining diffusion-based style transfer with a Structure-Preserving Network (SPN) to produce structure-aware stylizations. The OSASIS framework encodes semantic content via a Diffusion Autoencoder, preserves geometry with the SPN, and feeds the stylized images into segmentation models. Training is guided by style-transfer and segmentation losses that include CLIP-based adversarial objectives and cycle-consistency. The approach improves robustness and accuracy on colonoscopy polyp and skin lesion segmentation across unseen target domains, outperforming baseline models without style transfer. OSASIS is compatible with existing segmentation architectures and does not require access to the target domain during training, which makes it practical for diverse clinical settings. Future directions include developing direct metrics for segmentation quality after style transfer and enabling end-to-end stylized training for medical image segmentation.
Abstract
Accurate medical image segmentation is essential for effective diagnosis and treatment planning, but it is often challenged by domain shifts caused by variations in imaging devices, acquisition conditions, and patient-specific attributes. Traditional domain generalization methods typically require that parts of the test domain be included in the training set, which is not always feasible in clinical settings where diverse data are limited. Additionally, although diffusion models have demonstrated strong capabilities in image generation and style transfer, they often fail to preserve the structural information that precise medical analysis depends on. To address these issues, we propose a novel medical image segmentation method that combines diffusion models with a Structure-Preserving Network (SPN) for structure-aware one-shot image stylization. Our approach mitigates domain shifts by transforming images from various sources into a consistent style while maintaining the location, size, and shape of lesions. This keeps segmentation robust and accurate even when the target domain is absent from the training data. Experimental evaluations on colonoscopy polyp segmentation and skin lesion segmentation datasets show that our method enhances the robustness and accuracy of segmentation models, outperforming baseline models without style transfer. This structure-aware stylization framework offers a practical solution for improving medical image segmentation across diverse domains, facilitating more reliable clinical diagnoses.
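The stylize-then-segment idea can be illustrated with a toy sketch. It is not the proposed method: the diffusion-based stylizer and the SPN are replaced by simple per-channel mean/std matching, chosen only because it changes appearance without moving any pixel. The segmenter is a fixed brightness threshold, the images are synthetic, and all function names (`match_channel_stats`, `threshold_segmenter`, `dice`) are hypothetical.

```python
import numpy as np

def match_channel_stats(image, style_ref, eps=1e-8):
    """Stand-in stylizer (NOT the diffusion model or SPN from the paper).

    Shifts each channel's mean and std to those of the style reference.
    The map is affine per pixel, so location, size, and shape are unchanged.
    """
    img = image.astype(np.float64)
    ref = style_ref.astype(np.float64)
    mu_i = img.mean(axis=(0, 1), keepdims=True)
    sd_i = img.std(axis=(0, 1), keepdims=True)
    mu_r = ref.mean(axis=(0, 1), keepdims=True)
    sd_r = ref.std(axis=(0, 1), keepdims=True)
    return (img - mu_i) / (sd_i + eps) * sd_r + mu_r

def threshold_segmenter(image, threshold):
    """Hypothetical fixed 'segmentation model': bright pixels count as lesion."""
    return image.mean(axis=2) > threshold

def dice(pred, truth):
    """Dice overlap between two boolean masks."""
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Synthetic source-domain image: a bright square "lesion" on a dark background.
rng = np.random.default_rng(0)
truth = np.zeros((64, 64), dtype=bool)
truth[20:40, 24:44] = True
source = np.where(truth[..., None], 0.8, 0.2) + rng.normal(0, 0.02, (64, 64, 3))

# Simulated unseen domain: same anatomy, darker and lower contrast.
target = 0.4 * source + 0.05

threshold = 0.5  # tuned on the source domain only
dice_raw = dice(threshold_segmenter(target, threshold), truth)
stylized = match_channel_stats(target, source)  # one style reference image
dice_stylized = dice(threshold_segmenter(stylized, threshold), truth)

print(f"Dice without stylization: {dice_raw:.2f}")
print(f"Dice with stylization:    {dice_stylized:.2f}")
```

In this toy setting the segmenter fails on the shifted image and recovers once the image is mapped back to the source style, using a single reference image and no target-domain training data. Real domain shifts are not simple affine intensity changes, which is why the paper relies on a learned stylizer with explicit structure preservation.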
