Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao

TL;DR

This paper tackles Mixed Domain Semi-supervised medical image Segmentation (MiDSS) with three components: Unified Copy-Paste (UCP) to construct intermediate domains, symmetric guidance to exploit intermediate-domain information, and progressive style-transition augmentation via TP-RAM. The approach narrows domain gaps, improves pseudo-label reliability, and stabilizes training under domain shift, with a reported 13.57% Dice-score improvement on the Prostate dataset over prior state-of-the-art methods. Experiments on the Fundus, Prostate, and M&Ms datasets show strong, robust performance against both semi-supervised medical image segmentation (SSMS) and unsupervised domain adaptation (UDA) baselines, with results approaching upper-bound scenarios when labeled data is plentiful. The publicly available code supports practical deployment and evaluation in real-world multi-center medical imaging settings.

Abstract

Both limited annotation and domain shift are prevalent challenges in medical image segmentation. Traditional semi-supervised segmentation and unsupervised domain adaptation methods address one of these issues separately. However, the coexistence of limited annotation and domain shift is quite common, which motivates us to introduce a novel and challenging scenario: Mixed Domain Semi-supervised medical image Segmentation (MiDSS). In this scenario, we handle data from multiple medical centers, with limited annotations available for a single domain and a large amount of unlabeled data from multiple domains. We found that the key to solving the problem lies in how to generate reliable pseudo labels for the unlabeled data in the presence of domain shift with labeled data. To tackle this issue, we employ Unified Copy-Paste (UCP) between images to construct intermediate domains, facilitating the knowledge transfer from the domain of labeled data to the domains of unlabeled data. To fully utilize the information within the intermediate domain, we propose a Symmetric Guidance training strategy (SymGD), which additionally offers direct guidance to unlabeled data by merging pseudo labels from intermediate samples. Subsequently, we introduce a Training Process aware Random Amplitude MixUp (TP-RAM) to progressively incorporate style-transition components into intermediate samples. Compared with existing state-of-the-art approaches, our method achieves a notable 13.57% improvement in Dice score on the Prostate dataset, as demonstrated on three public datasets. Our code is available at https://github.com/MQinghe/MiDSS.
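
The copy-paste and pseudo-label merging steps can be illustrated with a minimal NumPy sketch. It is not taken from the authors' repository: the function names, the rectangular mask shape, and the `ratio` parameter are assumptions made for illustration. It shows two complementary intermediate samples being built from a labeled and an unlabeled image, and the unlabeled regions of their predictions being reassembled into one full-size map for the unlabeled image.

```python
import numpy as np

def random_box_mask(height, width, ratio=0.5, rng=None):
    """Binary mask: 1 keeps the first image, 0 (a random rectangle) takes the second.
    The rectangular shape and `ratio` are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    box_h, box_w = int(height * ratio), int(width * ratio)
    top = rng.integers(0, height - box_h + 1)
    left = rng.integers(0, width - box_w + 1)
    mask = np.ones((height, width), dtype=np.float32)
    mask[top:top + box_h, left:left + box_w] = 0.0
    return mask

def unified_copy_paste(labeled, unlabeled, mask):
    """Cut both images with the same mask and mix the parts in both directions."""
    sample_in = labeled * mask + unlabeled * (1.0 - mask)   # unlabeled patch inside labeled image
    sample_out = unlabeled * mask + labeled * (1.0 - mask)  # labeled patch inside unlabeled image
    return sample_in, sample_out

def merge_unlabeled_regions(pred_in, pred_out, mask):
    """Stitch the unlabeled regions of both predictions into one map for the unlabeled image."""
    return pred_out * mask + pred_in * (1.0 - mask)
```

In a training loop, `pred_in` and `pred_out` would be the model outputs for the two intermediate samples; the merged map could then be combined with the direct prediction on the unlabeled image, which is the ensemble idea the abstract describes.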

Paper Structure

This paper contains 14 sections, 11 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: The upper figure illustrates SSMS, UDA, and MiDSS. The lower figure shows the comparison between different methods on the labeled domain (BIDMC) [liu2020shape] and other domains.
  • Figure 2: The overall framework of our method emphasizes domain knowledge transfer through data augmentation and training strategy. We generate intermediate samples through UCP between labeled data and unlabeled data. During training, we gradually introduce style transfer components to the intermediate samples, constructing intermediate domains at both semantic and stylistic levels. Further details about UCP and TP-RAM are provided in their respective sections. Moreover, we design a symmetric guidance for model training. In addition to guiding from unlabeled data to intermediate samples, we merge unlabeled regions of intermediate samples to obtain the pseudo label of unlabeled data from another perspective. The integration of pseudo labels through ensemble from two perspectives guides the prediction of unlabeled data. Best viewed in color.
  • Figure 3: The illustration of UCP between images. Cut refers to splitting the image into two parts according to $M_{\alpha}$, while Mix implies merging two parts of images back together.
  • Figure 4: The quality of pseudo labels in an SSMS method (FixMatch) with and without UCP, using 40 labeled samples from the BIDMC domain of the Prostate dataset [liu2020shape]. Each colored bar represents samples from a different domain, with the bar height indicating the quality of the pseudo labels generated by the model for unlabeled data from that domain.
  • Figure 5: The illustration of TP-RAM. Fast Fourier transform (FFT) extracts the amplitude and phase maps of $x^w$ and $y^w$. The low-frequency regions (determined by $M_\beta$) of the two amplitude maps are mixed. Through inverse Fast Fourier transform (IFFT), $x^u$ is synthesized, preserving consistent semantics while introducing a different style compared to $x^w$.
  • ...and 1 more figures
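
The Fourier-based style transition in the TP-RAM caption can be sketched as follows. This is a hedged illustration, not the authors' implementation: as the name Random Amplitude MixUp suggests, it blends low-frequency amplitude and keeps the source phase. The function names, the square low-frequency window controlled by `beta`, and the linear schedule for the upper bound of the mixing weight `lam` are all assumptions.

```python
import numpy as np

def low_freq_mask(height, width, beta):
    """Centered square window on the shifted spectrum (assumed form of the low-frequency mask)."""
    mask = np.zeros((height, width), dtype=np.float32)
    half_h, half_w = int(height * beta / 2), int(width * beta / 2)
    cy, cx = height // 2, width // 2
    mask[cy - half_h:cy + half_h + 1, cx - half_w:cx + half_w + 1] = 1.0
    return mask

def amplitude_mixup(source, reference, lam, beta=0.1):
    """Blend the low-frequency amplitude of `reference` into `source`; the phase of `source` is kept."""
    f_src = np.fft.fftshift(np.fft.fft2(source))
    f_ref = np.fft.fftshift(np.fft.fft2(reference))
    amp_src, pha_src = np.abs(f_src), np.angle(f_src)
    amp_ref = np.abs(f_ref)
    mask = low_freq_mask(source.shape[0], source.shape[1], beta)
    # Inside the window the amplitudes are interpolated; outside, the source amplitude is untouched.
    amp_mix = amp_src * (1 - mask) + ((1 - lam) * amp_src + lam * amp_ref) * mask
    mixed = np.fft.ifftshift(amp_mix * np.exp(1j * pha_src))
    return np.real(np.fft.ifft2(mixed))

def tp_ram(source, reference, step, total_steps, beta=0.1, rng=None):
    """Sample the mixing weight from a range that widens with training progress (assumed linear schedule)."""
    rng = np.random.default_rng() if rng is None else rng
    lam_max = min(1.0, step / total_steps)
    lam = rng.uniform(0.0, lam_max)
    return amplitude_mixup(source, reference, lam, beta)
```

Early in training (`step` near 0) the output stays close to the source image, and the style of the reference is introduced more strongly as `step` approaches `total_steps`, which mirrors the progressive incorporation of style-transition components described in the abstract.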