Style-Aware Blending and Prototype-Based Cross-Contrast Consistency for Semi-Supervised Medical Image Segmentation

Chaowei Chen, Xiang Zhang, Honglie Guo, Shunfang Wang

TL;DR

The paper tackles semi-supervised medical image segmentation by addressing distribution mismatch and underutilized supervision between labeled and unlabeled data. It introduces a two-component framework: style-guided distribution blending to align domain statistics between labeled and unlabeled images, and a prototype-based cross-contrast mechanism that uses confidence-guided prototypes and a memory bank to enable bidirectional weak-strong supervision. The method integrates these modules into a Mean Teacher framework with a labeled branch trained on style-blended data and an unlabeled branch optimized with both pixel-wise consistency and cross-view prototype contrast, achieving state-of-the-art results on Synapse and ACDC under 5% and 10% labeling with extensive ablations validating each component. This work advances semi-supervised medical image segmentation by improving distribution alignment and exploiting richer supervisory signals, offering practical benefits for annotation-efficient clinical pipelines.

Abstract

Weak-strong consistency learning strategies are widely employed in semi-supervised medical image segmentation to train models by leveraging limited labeled data and enforcing weak-to-strong consistency. However, existing methods primarily focus on designing and combining various perturbation schemes, overlooking the inherent potential and limitations within the framework itself. In this paper, we first identify two critical deficiencies: (1) separated training data streams, which lead to confirmation bias dominated by the labeled stream; and (2) incomplete utilization of supervisory information, which limits exploration of strong-to-weak consistency. To tackle these challenges, we propose a style-aware blending and prototype-based cross-contrast consistency learning framework. Specifically, inspired by the empirical observation that the distribution mismatch between labeled and unlabeled data can be characterized by statistical moments, we design a style-guided distribution blending module to break the independent training data streams. Meanwhile, considering the potential noise in strong pseudo-labels, we introduce a prototype-based cross-contrast strategy to encourage the model to learn informative supervisory signals from both weak-to-strong and strong-to-weak predictions, while mitigating the adverse effects of noise. Experimental results demonstrate the effectiveness and superiority of our framework across multiple medical segmentation benchmarks under various semi-supervised settings.
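The moment-based idea in the abstract can be illustrated with a small sketch. The paper's exact blending rule is not given in this summary, so the code below assumes an AdaIN-style interpolation of the mean and standard deviation. The function name `blend_style`, the mixing weight `lam`, and the flat-list image representation are all hypothetical choices made to keep the example dependency-free.

```python
from statistics import fmean, pstdev


def blend_style(labeled, unlabeled, lam=0.5, eps=1e-6):
    """Re-style `labeled` intensities toward the moments of `unlabeled`.

    labeled, unlabeled: flat lists of pixel intensities (assumed layout).
    lam: weight kept on the labeled statistics (assumed hyperparameter);
         lam=1 leaves the image nearly unchanged, lam=0 fully adopts the
         unlabeled mean and standard deviation.
    """
    # First- and second-order moments of each image.
    mu_l, sigma_l = fmean(labeled), pstdev(labeled)
    mu_u, sigma_u = fmean(unlabeled), pstdev(unlabeled)

    # Interpolate the statistics (an assumed rule, not the paper's).
    mu_mix = lam * mu_l + (1.0 - lam) * mu_u
    sigma_mix = lam * sigma_l + (1.0 - lam) * sigma_u

    # Standardize with the labeled moments, then re-scale with the blend.
    return [sigma_mix * (x - mu_l) / (sigma_l + eps) + mu_mix for x in labeled]


labeled_pixels = [0.0, 1.0, 2.0, 3.0]
unlabeled_pixels = [10.0, 12.0, 14.0, 16.0]
blended = blend_style(labeled_pixels, unlabeled_pixels, lam=0.0)
```

Because only the intensity statistics change, the spatial content of the labeled image, and therefore its label, stays valid after blending. That property is what would let a blended image be trained against the original annotation.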

Paper Structure

This paper contains 13 sections, 10 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Panels (a) and (b) show the previous architecture (e.g., UA-MT; Yu et al., 2019) and the proposed one. Panels (c) and (d) compare the first- and second-order moments of pixel intensities in the Synapse dataset (Landman et al., 2015) across data splits and across categories, respectively.
  • Figure 2: Illustration of our framework, comprising two main components: a style-guided distribution blending module (solid box) and a dual-branch architecture (dashed box) with labeled and unlabeled branches. The labeled branch is trained on style-blended labeled data, while the unlabeled branch enforces both pixel-wise weak-strong consistency and prototype-based cross-contrast consistency. Here, $X_m^l$ and $Y^l$ denote the style-blended labeled image and its corresponding label; $X^u$ and $Y^w$ represent the unlabeled image and its pseudo-label; $A_w$ and $A_s$ indicate weak and strong augmentations, respectively.
  • Figure 3: Visual segmentation results of different methods on ACDC (top two rows) and Synapse (bottom two rows). Our method achieves a better balance between under-segmentation and over-segmentation.
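The prototype-based cross-contrast consistency named in the Figure 2 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the confidence threshold, the plain masked averaging, the cosine similarity, and the helper names `class_prototypes`, `cosine`, and `nearest_prototype` are all hypothetical, and the memory bank mentioned in the TL;DR is omitted.

```python
import math


def class_prototypes(features, probs, threshold=0.8):
    """Average the feature vectors of confident pixels, per predicted class.

    features: list of per-pixel feature vectors (lists of floats).
    probs:    list of per-pixel class-probability vectors.
    Returns {class_index: prototype}; classes with no confident pixel
    are left out.
    """
    sums, counts = {}, {}
    for feat, p in zip(features, probs):
        conf = max(p)
        label = p.index(conf)
        if conf < threshold:
            continue  # assumed filtering rule: drop low-confidence pixels
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, v in enumerate(feat):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def nearest_prototype(feat, prototypes):
    """Assign a feature from one view to the closest prototype of the other."""
    return max(prototypes, key=lambda c: cosine(feat, prototypes[c]))


# Weak-view features and predictions build the prototypes ...
weak_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
weak_probs = [[0.9, 0.1], [0.95, 0.05], [0.1, 0.9], [0.55, 0.45]]
weak_protos = class_prototypes(weak_feats, weak_probs)

# ... and a strong-view feature is matched against them.
assigned = nearest_prototype([0.8, 0.2], weak_protos)
```

Swapping the roles of the two views (prototypes from the strong view, features from the weak view) gives the opposite direction, which is one plausible reading of the bidirectional supervision described above. Averaging over confident pixels is also what would make a prototype less sensitive to individual noisy pseudo-labels.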