Semi-supervised Medical Image Segmentation via Query Distribution Consistency

Rong Wu, Dehua Li, Cong Zhang

TL;DR

This paper addresses semi-supervised medical image segmentation under scarce annotations by letting labeled data guide information extraction from unlabeled data through a mutual learning framework. It proposes Dual-KMax UX-Net, which combines a 3D UX-Net backbone with a k-means-based cross-attention module and a Dual-Contrastive Loss that enforces consistency between the query distribution and the segmentation outputs via $\mathcal{L}_{segc}$ and $\mathcal{L}_{qdc}$. Key contributions include triple-class segmentation (background, organ, tumor) with cluster-center distance updates and competitive performance on the Left Atrial dataset, notably with 10% and 20% labeled data. The code is publicly available, and the results suggest the approach is practical for accurate segmentation when annotations are scarce.

Abstract

Semi-supervised learning is increasingly popular in medical image segmentation due to its ability to leverage large amounts of unlabeled data to extract additional information. However, most existing semi-supervised segmentation methods focus only on extracting information from unlabeled data. In this paper, we propose a novel Dual KMax UX-Net framework that leverages labeled data to guide the extraction of information from unlabeled data. Our approach is based on a mutual learning strategy that incorporates two modules: 3D UX-Net as our backbone meta-architecture and a KMax decoder to enhance segmentation performance. Extensive experiments on the Atrial Segmentation Challenge dataset show that our method significantly improves performance by incorporating unlabeled data. Moreover, our framework outperforms state-of-the-art semi-supervised learning methods in the 10% and 20% labeled-data settings. Code is available at: https://github.com/Rows21/DK-UXNet.
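
To make the KMax decoder idea more concrete, the following is a minimal sketch of a single k-means-style cross-attention step, in which each pixel (or voxel) feature is hard-assigned to its closest cluster center and the centers are then updated from their assigned features. It is an illustration under simplifying assumptions, not the authors' implementation: the function name `kmeans_cross_attention`, the use of a dot product as the affinity, and the plain residual update without learned projections are all hypothetical choices made here for brevity.

```python
def kmeans_cross_attention(centers, pixel_feats):
    """One k-means-style cross-attention step (illustrative sketch only).

    centers:     K cluster centers (object queries), each a list of D floats
    pixel_feats: N pixel/voxel feature vectors, each a list of D floats
    Returns (updated_centers, assignments).
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Hard assignment: each pixel picks the center with the highest affinity
    # (an argmax over clusters, in place of a softmax over pixels).
    assignments = [
        max(range(len(centers)), key=lambda k: dot(centers[k], feat))
        for feat in pixel_feats
    ]

    # Residual update: add the features assigned to each center.
    # (Assumption: no learned query/key/value projections in this sketch.)
    updated = [list(c) for c in centers]
    for feat, k in zip(pixel_feats, assignments):
        for d, value in enumerate(feat):
            updated[k][d] += value
    return updated, assignments


# Toy example: two centers and three 2-D "pixel" features.
centers = [[1.0, 0.0], [0.0, 1.0]]
pixels = [[2.0, 0.0], [3.0, 0.0], [0.0, 1.0]]
new_centers, labels = kmeans_cross_attention(centers, pixels)
print(labels)       # cluster index chosen by each pixel
print(new_centers)  # centers after the residual update
```

In this toy example the first two features are assigned to the first center and the third to the second center. In a full model the assignment map would also serve as the segmentation prediction, and the step would be repeated across decoder layers.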

Paper Structure (15 sections, 5 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Overall workflow of the proposed method, which uses a dual KMax-based contrastive learning strategy (details can be found in the sections on the UX-Net backbone and the k-means decoder).
  • Figure 2: The meta-architecture of the backbone network consists of three components: a ConvNeXt encoder, a CNN-based decoder, and a kMaX decoder.
  • Figure 3: An illustration of kMaX UX-Net.
  • Figure 4: 3D visualization of LA segmentation results for the different ablation settings. GT: ground truth. (Best viewed in color.)