Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation for Semi-Supervised Medical Image Segmentation

Yuanbin Chen, Tao Wang, Hui Tang, Longxuan Zhao, Ruige Zong, Shun Chen, Tao Tan, Xinlin Zhang, Tong Tong

TL;DR

To tackle limited labeled data in medical image segmentation, the authors propose DCPA, which combines pseudo-label guided data augmentation with dual-decoder consistency under a mean-teacher framework. The method uses a shared encoder, two decoders with different upsampling paths, and EMA to produce stable teacher predictions, while unlabeled data are augmented and mixed with labeled data using pseudo-labels. A sharpening step and a three-term loss $L_{total} = L_{sup} + L_{unsup} + \lambda L_{con}$ guide cross-decoder consistency and supervision from both ground-truth and pseudo-labels. Empirical results on Pancreas-CT, LA, and ACDC show large gains with as little as 5% labeled data, often matching or surpassing fully supervised baselines, with publicly available code.
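As a rough illustration of two pieces named above, the sketch below shows an exponential moving average (EMA) update of teacher weights and the combination $L_{total} = L_{sup} + L_{unsup} + \lambda L_{con}$. The function names, the flat weight lists, and the decay value are assumptions made for this example; they are not taken from the authors' code.

```python
# Minimal sketch, not the authors' implementation.
# Weights are modelled as flat lists of floats (an assumption for brevity).

def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student weights by EMA.

    The decay value is a hypothetical default, not one reported in the source.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def total_loss(l_sup, l_unsup, l_con, lam):
    """Combine the three terms as L_total = L_sup + L_unsup + lambda * L_con."""
    return l_sup + l_unsup + lam * l_con


# Toy values chosen only to exercise the two functions.
teacher = ema_update([0.0, 1.0], [1.0, 3.0], decay=0.9)
loss = total_loss(l_sup=0.5, l_unsup=0.25, l_con=0.5, lam=0.5)
```

With a decay close to 1 the teacher changes slowly, which is why EMA teachers tend to give more stable predictions than the student they track.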

Abstract

While supervised learning has achieved remarkable success, obtaining large-scale labeled datasets in biomedical imaging is often impractical due to the high cost and time-consuming annotation required from radiologists. Semi-supervised learning is an effective strategy for overcoming this limitation by leveraging useful information from unlabeled data. In this paper, we present a novel semi-supervised learning method, Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation (DCPA), for medical image segmentation. We devise a consistency regularization to promote consistent representations during training. Specifically, we use distinct decoders for the student and teacher networks while maintaining the same encoder. Moreover, to learn from unlabeled data, we generate pseudo-labels with the teacher networks and use them to augment the training data. Both techniques contribute to the performance of the proposed method. The method is evaluated on three representative medical image segmentation datasets. Comprehensive comparisons with state-of-the-art semi-supervised medical image segmentation methods were conducted under typical scenarios, using 10% and 20% labeled data, as well as in the extreme scenario of only 5% labeled data. The experimental results consistently demonstrate the superior performance of our method compared to other methods across the three semi-supervised settings. The source code is publicly available at https://github.com/BinYCn/DCPA.git.
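The pseudo-label guided augmentation can be pictured with a standard Mixup step, in which an unlabeled image paired with its teacher pseudo-label is blended with a labeled image and its ground truth. This is a sketch under assumptions: it uses the generic Mixup formula with a Beta-distributed coefficient, flat lists of pixel values, and hypothetical function names. The exact variant used in DCPA may differ.

```python
import random

# Minimal sketch of generic Mixup, assumed here for illustration only.

def sample_lambda(alpha=0.5, rng=random):
    """Draw a mixing coefficient from Beta(alpha, alpha); alpha is a hypothetical choice."""
    return rng.betavariate(alpha, alpha)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two images and their (soft) label maps with the same coefficient."""
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y


# Toy two-pixel example: a labeled sample and an unlabeled sample with a pseudo-label.
labeled_x, labeled_y = [1.0, 0.0], [1.0, 0.0]    # ground-truth label
unlabeled_x, pseudo_y = [0.0, 1.0], [0.0, 1.0]   # pseudo-label from the teacher
mixed_x, mixed_y = mixup(labeled_x, labeled_y, unlabeled_x, pseudo_y, lam=0.75)
```

In training, `lam` would come from `sample_lambda()` rather than being fixed; it is fixed here so the blended values are easy to inspect.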

Paper Structure

This paper contains 34 sections, 13 equations, 9 figures, and 4 tables.

Figures (9)

  • Figure 1: Overview of the proposed DCPA method. Detailed descriptions of both the teacher model and the student model are provided beneath the figure. Notably, Decoder 1 and Decoder 2 correspond to two distinct decoders employing different upsampling strategies. The circular annotations of $L_{con}$ indicate the computation of consistency loss between Pred 1 and Pred* 2, as well as between Pred 2 and Pred* 1. EMA stands for exponential moving average.
  • Figure 2: Illustration of the effects of Mixup. The red rectangles within the image highlight areas where noticeable alterations are observed before and after applying Mixup.
  • Figure 3: Demonstration of results after sharpening. The red rectangles highlight regions that change noticeably before and after sharpening.
  • Figure 4: 2D and 3D views of the segmentation results on the Pancreas-CT dataset. For better visualization, we delineate the segmentation outline of the ground truth (green) and overlay it on the predictions.
  • Figure 5: 3D views of the segmentation results by different methods on the LA dataset. Note that 5% labeled data were used for training.
  • ...and 4 more figures
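
The cross-decoder consistency described in the Figure 1 caption (Pred 1 against Pred* 2, and Pred 2 against Pred* 1) can be sketched as follows. Several points are assumptions for this example: Pred* is taken to be a sharpened copy of the other decoder's foreground probabilities, sharpening uses a common temperature-based form, and the distance is mean squared error. The function names and the temperature are hypothetical, and the paper's exact definitions may differ.

```python
# Minimal sketch under stated assumptions, not the authors' implementation.

def sharpen(p, temperature=0.5):
    """Push a foreground probability p in [0, 1] toward 0 or 1.

    Uses p^(1/T) / (p^(1/T) + (1 - p)^(1/T)), a common sharpening form
    assumed here; T is a hypothetical value.
    """
    a = p ** (1.0 / temperature)
    b = (1.0 - p) ** (1.0 / temperature)
    return a / (a + b)


def mse(u, v):
    """Mean squared error between two equally sized probability maps."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def cross_consistency(pred1, pred2, temperature=0.5):
    """Average of the two cross terms: (Pred 1 vs Pred* 2) and (Pred 2 vs Pred* 1)."""
    sharp1 = [sharpen(p, temperature) for p in pred1]
    sharp2 = [sharpen(p, temperature) for p in pred2]
    return 0.5 * (mse(pred1, sharp2) + mse(pred2, sharp1))


# Toy per-pixel foreground probabilities from the two decoders.
pred1 = [0.8, 0.3, 0.5]
pred2 = [0.7, 0.4, 0.5]
l_con = cross_consistency(pred1, pred2)
```

Because each decoder is compared with a sharpened version of the other, the loss is nonzero even when both decoders agree on an uncertain value such as 0.7, which nudges both toward more confident predictions.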