Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Zekun Li, Lei Qi, Qian Yu, Yinghuan Shi

TL;DR

This work tackles the problem of domain shift and limited annotations in mixed-domain semi-supervised medical image segmentation by introducing SynFoC, a synergistic framework that jointly trains a foundation model (MedSAM) and a conventional model (U-Net). It dynamically balances their influence via Self-Mutual Confidence to produce high-quality pseudo-labels and employs Consensus-Divergence Consistency Regularization to enforce reliable convergence and representation alignment. Across four public multi-domain datasets, SynFoC delivers substantial gains, including a 10.31% Dice improvement on Prostate with just 20 labeled samples, demonstrating robust performance under diverse domain shifts. The approach offers a generalizable mechanism to leverage foundation models alongside traditional architectures in challenging medical imaging scenarios, with potential applicability to other foundation-model combinations.

Abstract

Large pretrained visual foundation models exhibit impressive general capabilities. However, the extensive prior knowledge inherent in these models can sometimes be a double-edged sword when adapting them to downstream tasks in specific domains. In the context of semi-supervised medical image segmentation with domain shift, foundation models like MedSAM tend to make overconfident predictions, some of which are incorrect. The error accumulation hinders the effective utilization of unlabeled data and limits further improvements. In this paper, we introduce a Synergistic training framework for Foundation and Conventional models (SynFoC) to address the issue. We observe that a conventional model trained from scratch has the ability to correct the high-confidence mispredictions of the foundation model, while the foundation model can supervise it with high-quality pseudo-labels in the early training stages. Furthermore, to enhance the collaborative training effectiveness of both models and promote reliable convergence towards optimization, the consensus-divergence consistency regularization is proposed. We demonstrate the superiority of our method across four public multi-domain datasets. In particular, our method improves the Dice score by 10.31% on the Prostate dataset. Our code is available at https://github.com/MQinghe/SynFoC.
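
Since the headline result is reported as a Dice score, a minimal sketch of that metric may help. It uses the standard definition, Dice = 2|P ∩ G| / (|P| + |G|), for a predicted binary mask P and a ground-truth mask G. The function name `dice_coefficient`, the list-of-lists mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the SynFoC code.

```python
from itertools import chain


def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|P ∩ G| / (|P| + |G|) for two binary masks given as 2D lists of 0/1."""
    # Flatten both masks and convert every pixel to a boolean.
    pred = [bool(v) for v in chain.from_iterable(pred_mask)]
    true = [bool(v) for v in chain.from_iterable(true_mask)]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")

    # Count pixels that are foreground in both masks.
    intersection = sum(p and g for p, g in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    gt = [[1, 0, 0],
          [0, 1, 1]]
    # Two overlapping foreground pixels, three foreground pixels in each mask.
    print(round(dice_coefficient(pred, gt), 4))  # 2 * 2 / (3 + 3) = 0.6667
```

The score ranges from 0 (no overlap) to 1 (identical masks). On this scale the reported 10.31% gain corresponds to roughly 0.10, assuming it is stated in percentage points.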

Paper Structure

This paper contains 22 sections, 12 equations, 12 figures, 14 tables.

Figures (12)

  • Figure 1: Illustration of pseudo-label generation across training stages for various methods. In experiments (a) and (b) on the Prostate dataset, the 20 labeled samples come from BIDMC and HK, respectively. We report the Dice coefficient between each pseudo-label and the ground truth, which is represented by the green contour. Standalone training of U-Net and MedSAM, as well as guidance from MedSAM to U-Net, fails to effectively address MiDSS (mixed-domain semi-supervised medical image segmentation).
  • Figure 2: The overall framework of our SynFoC. For U-Net and MedSAM, the teacher model generates pseudo-labels for intermediate samples to guide the student model. To reduce computational costs, we apply the LoRA module to MedSAM. We design various pseudo-label integration strategies to combine the predictions of both models, aiming to achieve higher-quality pseudo-labels. Additionally, we introduce consensus-divergence consistency regularization to enhance the efficiency of the synergistic training.
  • Figure 3: The performance comparison of U-Net and MedSAM under standalone and SMC-based synergistic training on Prostate.
  • Figure 4: Illustration of self-confidence and mutual confidence evaluation and Consensus-Divergence consistency regularization.
  • Figure 5: Visual comparison of different methods on the Prostate dataset. The test samples are drawn from the labeled domain (HK) and another domain (RUNMC), respectively.
  • ...and 7 more figures
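
The Figure 2 caption mentions integration strategies that combine the predictions of both models into one pseudo-label. The page does not give the Self-Mutual Confidence formula, so the sketch below is not the authors' method. It shows only a generic confidence-weighted ensemble of two foreground-probability maps, under assumptions made here for illustration: confidence is the mean of max(p, 1 - p) over pixels, fusion is a weighted average, and the binarization threshold is 0.5. The names `mean_confidence` and `fuse_pseudo_labels` are hypothetical.

```python
def mean_confidence(probs):
    """Average per-pixel confidence max(p, 1 - p) of foreground probabilities."""
    if not probs:
        raise ValueError("probability map must not be empty")
    return sum(max(p, 1.0 - p) for p in probs) / len(probs)


def fuse_pseudo_labels(probs_a, probs_b, threshold=0.5):
    """Confidence-weighted average of two probability maps, then thresholding.

    Generic illustration only; the weighting rule is an assumption,
    not the Self-Mutual Confidence scheme from the paper.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("probability maps must have the same length")
    w_a = mean_confidence(probs_a)  # always >= 0.5, so the weight sum is never zero
    w_b = mean_confidence(probs_b)
    fused = [(w_a * pa + w_b * pb) / (w_a + w_b)
             for pa, pb in zip(probs_a, probs_b)]
    return [1 if p >= threshold else 0 for p in fused]


if __name__ == "__main__":
    foundation_probs = [0.9, 0.9, 0.1, 0.6]    # hypothetical flattened outputs
    conventional_probs = [0.8, 0.4, 0.2, 0.3]  # hypothetical flattened outputs
    print(fuse_pseudo_labels(foundation_probs, conventional_probs))  # [1, 1, 0, 0]
```

In this toy example the more confident map decides the second pixel, where the two models disagree, while the fourth pixel falls below the threshold after averaging.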