Balancing Multi-Target Semi-Supervised Medical Image Segmentation with Collaborative Generalist and Specialists

You Wang, Zekun Li, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao

TL;DR

This work tackles scale imbalance in multi-target semi-supervised medical image segmentation by proposing a Collaborative Generalist and Specialists (CGS) framework that jointly trains a generalist head for all targets and K class-specific specialists. It introduces cross-branch consistency losses and an Inter-Head Error Detection (IHED) module to improve pseudo-label quality while keeping inference memory and parameters near those of a standard UNet. The approach yields state-of-the-art results on ACDC, SegTHOR, and Synapse with limited labeled data and demonstrates robustness across datasets and backbones, while remaining compatible as a plug-in for other SSL methods. The method offers practical impact by enabling more accurate multi-organ segmentation in settings with scarce annotations and by providing a modular, inference-efficient strategy for scalable semi-supervised learning in medical imaging.

Abstract

Despite the promising performance achieved by current semi-supervised models in segmenting individual medical targets, many of these models suffer a notable drop in performance when asked to segment multiple targets simultaneously. A key factor may be the imbalanced scales among different targets: when multiple targets are segmented simultaneously, large targets dominate the loss, leading to small targets being misclassified as larger ones. To this end, we propose a novel method, termed CGS, which consists of a Collaborative Generalist and several Specialists. It is centered around the idea of employing a specialist for each target class, thus avoiding the dominance of larger targets. The generalist performs conventional multi-target segmentation, while each specialist is dedicated to distinguishing a specific target class from the remaining target classes and the background. Based on a theoretical insight, we demonstrate that CGS can achieve more balanced training. Moreover, we develop cross-consistency losses to foster collaborative learning between the generalist and the specialists. Lastly, exploiting the intrinsic relation that the target class of any specialized head should belong to the remaining classes of the other heads, we introduce an inter-head error detection module to further enhance the quality of pseudo-labels. Experimental results on three popular benchmarks demonstrate its superior performance compared to state-of-the-art methods.
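To make the specialist's task concrete, the following is a minimal sketch of how a multi-class label map could be relabeled for one specialist. It assumes a three-way encoding (background, target, remaining); the constants and the function name `specialist_labels` are hypothetical and are not taken from the paper.

```python
from typing import List

LabelMap = List[List[int]]

# Hypothetical three-way encoding for a specialist's label space (an assumption).
BACKGROUND, TARGET, REMAINING = 0, 1, 2


def specialist_labels(label_map: LabelMap, k: int) -> LabelMap:
    """Relabel a multi-class map for the specialist of class k.

    Class 0 is assumed to be background. Class k becomes TARGET, and every
    other foreground class is merged into REMAINING.
    """
    relabeled = []
    for row in label_map:
        new_row = []
        for c in row:
            if c == 0:
                new_row.append(BACKGROUND)
            elif c == k:
                new_row.append(TARGET)
            else:
                new_row.append(REMAINING)
        relabeled.append(new_row)
    return relabeled


# Toy 3x3 label map with three foreground classes (1, 2, 3).
toy = [
    [0, 1, 1],
    [2, 2, 3],
    [0, 3, 3],
]
view_for_class_2 = specialist_labels(toy, 2)
```

In this toy case, the specialist for class 2 sees its own class as the target, while classes 1 and 3 collapse into a single remaining class, so a large neighboring structure no longer competes as its own class in that head.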

Paper Structure

This paper contains 16 sections, 25 equations, 10 figures, 15 tables.

Figures (10)

  • Figure 1: The left figure illustrates the occurrence imbalance in the VOC dataset, where the frequency of class occurrences is uneven. The right figure shows the scale imbalance in the SegTHOR dataset, where class occurrences are relatively balanced.
  • Figure 2: Green, red, and blue respectively represent the target classes of three segmentation heads, while purple represents the remaining classes. Taking red as an example, the red region in the middle image should belong to the purple regions in the two side images. The remaining class consists of several target classes. Best viewed in color.
  • Figure 3: The pipeline of our proposed CGS (illustrated using ACDC as an example). The framework comprises the generalist as the general branch and the specialists as the multi-head specialized branch. Building upon this, the cross-branch consistency losses are computed and the inter-head error detection module is applied. $P$ represents predictions from the general branch for unlabeled data $X^u$, while $Q$ denotes the predictions from the specialized branch. $B_k$ is the pseudo-label from the general branch for the specialized branch, while $B$ is the pseudo-label from the specialized branch for the general branch. $M_d$ denotes the error detection matrix.
  • Figure 4: The left figure shows the training participation proportions of different classes using the conventional training method, while the right figure illustrates the corresponding proportions in our method.
  • Figure 5: Visualization of segmentation results on ACDC. Our method achieves segmentation results that most closely match the ground truth.
  • ...and 5 more figures
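
The relation described in the Figure 2 caption (a head's target region should fall inside the remaining-class region of every other head) can be checked per pixel. The sketch below is one plausible reading of that relation, not the paper's IHED module: the three-way encoding, the function name `inconsistent_pixels`, and the boolean mask it returns are all assumptions made for illustration.

```python
from typing import List

LabelMap = List[List[int]]

# Hypothetical three-way encoding for each specialized head (an assumption).
BACKGROUND, TARGET, REMAINING = 0, 1, 2


def inconsistent_pixels(head_preds: List[LabelMap]) -> List[List[bool]]:
    """Flag pixels where some head predicts TARGET but another head
    does not predict REMAINING at the same location."""
    rows, cols = len(head_preds[0]), len(head_preds[0][0])
    mask = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            votes = [pred[i][j] for pred in head_preds]
            for k, vote in enumerate(votes):
                if vote != TARGET:
                    continue
                others = votes[:k] + votes[k + 1:]
                if any(o != REMAINING for o in others):
                    mask[i][j] = True
    return mask


# Two heads on a 1x3 image: they agree on pixel 0 and disagree on pixel 1.
head_a = [[TARGET, TARGET, BACKGROUND]]
head_b = [[REMAINING, TARGET, BACKGROUND]]
flags = inconsistent_pixels([head_a, head_b])
```

Here pixel 1 is flagged because both heads claim it as their own target. Flagged pixels would be candidates for exclusion or correction before pseudo-labels are used; how the paper actually uses $M_d$ is not specified in this summary.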