Structure-Aware Stylized Image Synthesis for Robust Medical Image Segmentation

Jie Bao, Zhixin Zhou, Wen Jung Li, Rui Luo

TL;DR

This work tackles domain shift in medical image segmentation by combining diffusion-based style transfer with a Structure-Preserving Network (SPN) to produce structure-aware stylizations. The OSASIS framework encodes semantic content via a Diffusion Autoencoder, preserves geometry with SPN, and feeds stylized images into segmentation models, guided by style-transfer and segmentation losses that include CLIP-based adversarial objectives and cycle-consistency. The approach demonstrates improved robustness and accuracy on colonoscopy polyp and skin lesion segmentation tasks across unseen target domains, outperforming baseline models without style transfer. Importantly, OSASIS is compatible with existing segmentation architectures and does not require access to the target domain during training, offering practical value for diverse clinical settings. Future directions include developing direct metrics for segmentation quality after style transfer and enabling end-to-end stylized training for medical image segmentation.

Abstract

Accurate medical image segmentation is essential for effective diagnosis and treatment planning but is often challenged by domain shifts caused by variations in imaging devices, acquisition conditions, and patient-specific attributes. Traditional domain generalization methods typically require inclusion of parts of the test domain within the training set, which is not always feasible in clinical settings with limited diverse data. Additionally, although diffusion models have demonstrated strong capabilities in image generation and style transfer, they often fail to preserve the critical structural information necessary for precise medical analysis. To address these issues, we propose a novel medical image segmentation method that combines diffusion models with a Structure-Preserving Network for structure-aware one-shot image stylization. Our approach mitigates domain shifts by transforming images from various sources into a consistent style while maintaining the location, size, and shape of lesions. This supports robust and accurate segmentation even when the target domain is absent from the training data. Experimental evaluations on colonoscopy polyp segmentation and skin lesion segmentation datasets show that our method enhances the robustness and accuracy of segmentation models, achieving better performance metrics than baseline models without style transfer. This structure-aware stylization framework offers a practical solution for improving medical image segmentation across diverse domains, facilitating more reliable clinical diagnoses.

Paper Structure

This paper contains 27 sections, 12 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Segmentation Challenge in Medical Images under Domain Shift.
  • Figure 2: Representative images from the training set of the Polyp datasets, shown after style transfer. The transferred images retain the lesion location information corresponding to the mask.
  • Figure 3: Medical Image Stylization for Segmentation Under Domain Shift.
  • Figure 4: Qualitative results of different models and models after style transfer in Polyp Datasets.
  • Figure 5: Radar charts illustrating the performance metrics (Dice, IoU, Specificity, $F_\beta^w$, $S_\alpha$, $E_\phi^{max}$, and 1-MAE) of UNet, UNet++ and PraNet segmentation models and their style transfer variants on Polyp datasets.
  • ...and 2 more figures
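
Several of the metrics named in the Figure 5 caption (Dice, IoU, Specificity, and 1-MAE) have standard textbook definitions over binary masks. The sketch below illustrates those definitions on flattened masks in plain Python. It is not the paper's evaluation code: the function names, the `eps` smoothing term, and the toy masks are assumptions made for illustration, and the structure-based metrics ($F_\beta^w$, $S_\alpha$, $E_\phi^{max}$) are omitted.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], target: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN) over two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred: Sequence[int], target: Sequence[int], eps: float = 1e-8) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN). eps (assumed) avoids 0/0."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def iou(pred: Sequence[int], target: Sequence[int], eps: float = 1e-8) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return (tp + eps) / (tp + fp + fn + eps)


def specificity(pred: Sequence[int], target: Sequence[int], eps: float = 1e-8) -> float:
    """True negative rate: TN / (TN + FP)."""
    _, fp, _, tn = confusion_counts(pred, target)
    return (tn + eps) / (tn + fp + eps)


def one_minus_mae(prob: Sequence[float], target: Sequence[int]) -> float:
    """1 - mean absolute error between predicted values and the mask."""
    if len(prob) != len(target):
        raise ValueError("prob and target must have the same length")
    return 1.0 - sum(abs(p - t) for p, t in zip(prob, target)) / len(prob)


# Toy flattened masks (hypothetical data): TP=2, FP=1, FN=1, TN=2.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice(pred, target), 4))           # 0.6667
print(round(iou(pred, target), 4))            # 0.5
print(round(specificity(pred, target), 4))    # 0.6667
print(round(one_minus_mae(pred, target), 4))  # 0.6667
```

In the radar charts these scores share one axis range, which is presumably why MAE is reported as 1-MAE: higher is then better for every metric.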