Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-supervised Medical Image Segmentation

Meghana Karri, Amit Soni Arya, Koushik Biswas, Nicolò Gennaro, Vedat Cicek, Gorkem Durak, Yuri S. Velichko, Ulas Bagci

TL;DR

This work tackles semi-supervised medical image segmentation under limited labeled data by introducing UG-CEMT, which fuses a cross-attention ensemble mean teacher with uncertainty-guided consistency regularization and Sharpness-Aware Minimization. The method preserves a shared backbone while enforcing high disparity between sub-networks and uses MC Dropout-derived uncertainty to weight learning signals, implemented in a two-stage training process. Evaluations on 3D left atrium MRI and multi-site prostate MRI show UG-CEMT achieving state-of-the-art performance, approaching fully supervised results with as little as 10% labeled data, and demonstrating strong domain generalization. The approach reduces annotation costs and enhances robustness for clinical deployment, with public code available.
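The uncertainty weighting mentioned above can be illustrated with a minimal sketch. It assumes a binary segmentation task, a teacher evaluated several times with dropout active (MC Dropout), and a simple entropy cutoff that drops unreliable voxels from the consistency term. The function names, the threshold value, and the masked mean-squared-error form are illustrative assumptions in the style of uncertainty-aware mean teacher methods; this summary does not state the exact formulation used by UG-CEMT.

```python
import math

def predictive_entropy(probs):
    """Binary entropy of the foreground probability averaged over stochastic passes."""
    p = sum(probs) / len(probs)   # mean foreground probability across passes
    eps = 1e-8                    # avoids log(0)
    return -(p * math.log(p + eps) + (1 - p) * math.log(1 - p + eps))

def uncertainty_masked_consistency(student, teacher_passes, threshold):
    """Mean squared student/teacher gap over voxels whose entropy is below threshold.

    student:        student foreground probabilities, one per voxel
    teacher_passes: T lists, each one stochastic teacher prediction (dropout on)
    threshold:      hypothetical entropy cutoff, not taken from the paper
    """
    total, kept = 0.0, 0
    for i, s in enumerate(student):
        samples = [run[i] for run in teacher_passes]
        if predictive_entropy(samples) < threshold:
            t_mean = sum(samples) / len(samples)
            total += (s - t_mean) ** 2
            kept += 1
    return total / kept if kept else 0.0

# Toy data: voxel 0 is predicted consistently, voxel 1 is not.
student = [0.9, 0.6]
teacher_passes = [[0.95, 0.2], [0.97, 0.8], [0.96, 0.5]]
loss = uncertainty_masked_consistency(student, teacher_passes, threshold=0.5)
```

In this toy case the second voxel's teacher predictions disagree across passes, so its entropy exceeds the cutoff and only the first voxel contributes to the loss.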

Abstract

This work proposes a novel framework, Uncertainty-Guided Cross Attention Ensemble Mean Teacher (UG-CEMT), for achieving state-of-the-art performance in semi-supervised medical image segmentation. UG-CEMT leverages the strengths of co-training and knowledge distillation by combining a Cross-attention Ensemble Mean Teacher framework (CEMT) inspired by Vision Transformers (ViT) with uncertainty-guided consistency regularization and Sharpness-Aware Minimization emphasizing uncertainty. UG-CEMT improves semi-supervised performance while maintaining a consistent network architecture and task setting by fostering high disparity between sub-networks. Experiments demonstrate significant advantages over existing methods like Mean Teacher and Cross-pseudo Supervision in terms of disparity, domain generalization, and medical image segmentation performance. UG-CEMT achieves state-of-the-art results on multi-center prostate MRI and cardiac MRI datasets, where object segmentation is particularly challenging. Our results show that using only 10% labeled data, UG-CEMT approaches the performance of fully supervised methods, demonstrating its effectiveness in exploiting unlabeled data for robust medical image segmentation. The code is publicly available at https://github.com/Meghnak13/UG-CEMT.
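For readers unfamiliar with the Mean Teacher baseline named in the abstract: the teacher is not trained by gradient descent but tracks an exponential moving average (EMA) of the student's weights. The sketch below shows that update on plain Python lists. The dictionary layout, the parameter name `conv.weight`, and the decay `alpha` are illustrative assumptions; how UG-CEMT combines this update with its cross-attention ensemble is not described in this summary.

```python
def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the matching student weight, in place.

    teacher, student: dicts mapping a parameter name to a list of floats
    alpha:            EMA decay; values near 1 make the teacher change slowly
    """
    for name, s_values in student.items():
        t_values = teacher[name]
        for i, s in enumerate(s_values):
            t_values[i] = alpha * t_values[i] + (1 - alpha) * s

# Hypothetical single-parameter models.
teacher = {"conv.weight": [0.0, 1.0]}
student = {"conv.weight": [1.0, 1.0]}
ema_update(teacher, student, alpha=0.9)
```

After one step with `alpha=0.9`, the first teacher weight has moved one tenth of the way toward the student, while the second, already equal to the student's, stays put.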

Paper Structure

This paper contains 10 sections, 7 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Comparison of architectures and their performance on SSL segmentation tasks: (a) Mean Teacher (MT), (b) Cross-Pseudo Supervision (CPS), (c) the proposed UG-CEMT framework, (d) disparity between co-training sub-networks with respect to the Jaccard metric, (e) domain generalization effectiveness on the multi-site prostate dataset, and (f) segmentation performance on the single-site LA dataset.
  • Figure 2: The proposed UG-CEMT architecture. UG-CEMT creates new samples $X'$ from the input data using UGM. Cross-Attention (CA) is applied between the student and teacher models, where $O_{(s\rightarrow t)}$ and $O_{(t\rightarrow s)}$ denote the outputs of the attention mechanism from student to teacher and from teacher to student, respectively.
  • Figure 3: Overview of the proposed cross-attention (CA) mechanism (inspired by ViT).
  • Figure 4: Visualization of 3D segmentation outcomes of various SSL methods for $20\%$ labeled data on LA dataset.
  • Figure 5: Visualization of 3D segmentation outcomes of various SSL methods for $20\%$ labeled data on multi-site prostate dataset.
  • ...and 1 more figures
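
The bidirectional cross-attention described in the captions of Figures 2 and 3 can be sketched as scaled dot-product attention in which one network's tokens act as queries over the other network's tokens. This is only a rough illustration: learned query/key/value projections and multiple heads are omitted, the token values are made up, and the choice of which network supplies the queries for $O_{(s\rightarrow t)}$ is an assumption, since the captions do not specify it.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [e / z for e in exps]

def cross_attention(queries, context):
    """Scaled dot-product attention where queries attend over context tokens.

    Learned projections are omitted, so `context` serves as both keys and values.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        # Each output token is a weighted average of the context tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, context)) for j in range(d)])
    return out

# Hypothetical 2-token, 2-dimensional features from each network.
student_tokens = [[1.0, 0.0], [0.0, 1.0]]
teacher_tokens = [[1.0, 1.0], [0.0, 2.0]]
o_s_to_t = cross_attention(student_tokens, teacher_tokens)  # student queries, teacher context
o_t_to_s = cross_attention(teacher_tokens, student_tokens)  # teacher queries, student context
```

Because the attention weights for each query sum to one, every output token is a convex combination of the other network's tokens, which is how information from one sub-network is mixed into the other.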