Diff-CL: A Novel Cross Pseudo-Supervision Method for Semi-supervised Medical Image Segmentation

Xiuzhen Guo; Lianyuan Yu; Ji Shi; Na Lei; Hongxiao Wang

TL;DR

This work tackles semi-supervised medical image segmentation under limited labels by introducing Diff-CL, a distribution-aware framework that couples diffusion-based distribution modeling (DS) with detail-oriented CNN segmentation (CS) through cross-pseudo supervision. It adds a 3D high-frequency Mamba module that captures boundary and detail information globally at low cost, and uses contrastive label propagation to transfer class-semantic information from labeled to unlabeled regions. The method combines three components: dual losses for cross-pseudo supervision, a high-frequency attention mechanism, and a memory-bank-driven contrastive loss. These are integrated into a unified semi-supervised objective with $L^{d} = L^{d}_{s} + \mu_1 L^{d}_{p}$ and $L^{c} = L^{c}_{s} + \mu_2 L^{c}_{u}$, where $L^{c}_{u} = L^{c}_{p} + \eta L_{cl}$. Empirically, Diff-CL achieves state-of-the-art performance on the left atrium, BraTS brain tumor, and NIH pancreas datasets at low labeling ratios, which the authors attribute to the distribution perspective and the complementary roles of the two networks.
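The way the loss terms in the objective combine can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the formulas above, not the authors' implementation. The function name `diff_cl_objectives`, the argument names, and the default weights of 1.0 are assumptions. So is the reading of the subscripts as supervised ($s$), pseudo-supervision ($p$), unlabeled ($u$), and contrastive ($cl$).

```python
def diff_cl_objectives(l_s_d, l_p_d, l_s_c, l_p_c, l_cl,
                       mu1=1.0, mu2=1.0, eta=1.0):
    """Combine scalar loss terms as in the TL;DR formulas.

    All argument names and default weights are hypothetical; the inputs
    are assumed to be loss values that have already been computed.
    """
    # L^c_u = L^c_p + eta * L_cl  (unlabeled-data loss of the CNN branch)
    l_u_c = l_p_c + eta * l_cl
    # L^d = L^d_s + mu1 * L^d_p   (diffusion branch)
    l_d = l_s_d + mu1 * l_p_d
    # L^c = L^c_s + mu2 * L^c_u   (CNN branch)
    l_c = l_s_c + mu2 * l_u_c
    return l_d, l_c


# Example with made-up loss values and weights.
loss_d, loss_c = diff_cl_objectives(1.0, 2.0, 3.0, 4.0, 5.0,
                                    mu1=0.5, mu2=0.25, eta=2.0)
```

With these made-up inputs, the CNN branch's unlabeled term is $4 + 2 \cdot 5 = 14$, so the two objectives evaluate to $1 + 0.5 \cdot 2 = 2$ and $3 + 0.25 \cdot 14 = 6.5$.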

Abstract

Semi-supervised learning utilizes insights from unlabeled data to improve model generalization, thereby reducing reliance on large labeled datasets. Most existing studies focus on limited samples and fail to capture the overall data distribution. We contend that combining distributional information with detailed information is crucial for achieving more robust and accurate segmentation results. On the one hand, diffusion models (DMs), with their robust generative capabilities, learn the data distribution effectively. However, they struggle to capture fine details, which leads to generated images with misleading details. Combining a DM with convolutional neural networks (CNNs) lets the former learn the data distribution while the latter corrects fine details. Capturing complete high-frequency details with CNNs, however, requires substantial computational resources and is susceptible to local noise. On the other hand, given that labeled and unlabeled data come from the same distribution, we believe that regions in unlabeled data that are similar to the overall class semantics of the labeled data are likely to belong to the same class, while regions with minimal similarity are less likely to. This work introduces Diff-CL, a semi-supervised medical image segmentation framework built from the distribution perspective. First, we propose a cross-pseudo-supervision learning mechanism between diffusion and convolution segmentation networks. Second, we design a high-frequency Mamba module to capture boundary and detail information globally. Finally, we apply contrastive learning for label propagation from labeled to unlabeled data. Our method achieves state-of-the-art (SOTA) performance on three datasets: left atrium, brain tumor, and NIH pancreas.
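The intuition behind label propagation, that unlabeled regions resembling a class's overall semantics probably belong to that class, can be illustrated with a toy prototype-matching sketch. This is not the paper's contrastive loss or its memory bank. It assumes that class semantics are summarized by the mean feature vector of each labeled class, that similarity is measured by cosine similarity, and that a fixed threshold of 0.8 decides whether a region is assigned. All function names and feature values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def class_prototypes(features, labels):
    """Mean feature vector per class, computed from labeled features."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def propagate_labels(unlabeled, prototypes, threshold=0.8):
    """Assign each unlabeled feature the most similar class, or None
    when even the best similarity falls below the (assumed) threshold."""
    assigned = []
    for f in unlabeled:
        best, best_sim = None, -1.0
        for y, proto in prototypes.items():
            sim = cosine(f, proto)
            if sim > best_sim:
                best, best_sim = y, sim
        assigned.append(best if best_sim >= threshold else None)
    return assigned


# Toy 2-D features: class 0 lies near the x-axis, class 1 near the y-axis.
labeled = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
protos = class_prototypes(labeled, labels)
result = propagate_labels([[1.0, 0.1], [0.1, 1.0], [1.0, 1.0]], protos)
```

In this toy setup the first two unlabeled vectors are assigned to classes 0 and 1, while the ambiguous vector `[1.0, 1.0]` is equally similar to both prototypes (about 0.74) and is left unassigned.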
