Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-supervised Medical Image Segmentation
Meghana Karri, Amit Soni Arya, Koushik Biswas, Nicolò Gennaro, Vedat Cicek, Gorkem Durak, Yuri S. Velichko, Ulas Bagci
TL;DR
This work tackles semi-supervised medical image segmentation under limited labeled data by introducing UG-CEMT, which fuses a cross-attention ensemble mean teacher with uncertainty-guided consistency regularization and Sharpness-Aware Minimization. The method keeps the network architecture and task setting consistent across sub-networks while enforcing high disparity between them, and uses MC Dropout-derived uncertainty to weight learning signals, implemented in a two-stage training process. Evaluations on 3D left atrium MRI and multi-site prostate MRI show UG-CEMT achieving state-of-the-art performance, approaching fully supervised results with as little as 10% labeled data, and demonstrating strong domain generalization. The approach reduces annotation costs and improves robustness for clinical deployment; the code is publicly available.
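The uncertainty weighting mentioned above can be illustrated with a minimal sketch. It follows a common pattern in uncertainty-aware mean-teacher methods: run the teacher several times with dropout active, take the entropy of the averaged prediction as the uncertainty, and apply the consistency loss only where that entropy is low. The function names, the binary (foreground/background) setting, the hard threshold, and the squared-error loss are all assumptions for illustration; the summary does not specify UG-CEMT's exact formulation.

```python
import math


def predictive_entropy(prob_samples):
    """Binary entropy of the mean foreground probability over T stochastic passes."""
    p = sum(prob_samples) / len(prob_samples)
    eps = 1e-8  # avoids log(0) for fully confident predictions
    return -(p * math.log(p + eps) + (1.0 - p) * math.log(1.0 - p + eps))


def uncertainty_weighted_consistency(student_probs, teacher_samples, threshold):
    """Mean squared student/teacher gap over voxels the teacher is confident about.

    student_probs:   one foreground probability per voxel.
    teacher_samples: per voxel, a list of T teacher probabilities obtained
                     from MC Dropout forward passes (hypothetical inputs).
    threshold:       voxels with entropy at or above this value are ignored.
    """
    total, kept = 0.0, 0
    for s, samples in zip(student_probs, teacher_samples):
        if predictive_entropy(samples) < threshold:
            teacher_mean = sum(samples) / len(samples)
            total += (s - teacher_mean) ** 2
            kept += 1
    # If every voxel is uncertain, there is no consistency signal.
    return total / kept if kept else 0.0


# Voxel 0: teacher passes agree (low entropy), so it contributes to the loss.
# Voxel 1: teacher passes average to 0.5 (maximum entropy), so it is masked out.
loss = uncertainty_weighted_consistency(
    student_probs=[0.9, 0.6],
    teacher_samples=[[0.95, 0.97, 0.96], [0.2, 0.8, 0.5]],
    threshold=0.5,
)
```

In this toy case only the first voxel survives the mask, so the loss reduces to the squared gap between the student's 0.9 and the teacher's mean of 0.96. A soft weighting (for example, scaling each voxel by a decreasing function of its entropy) would be an equally plausible reading of "uncertainty-guided".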
Abstract
This work proposes a novel framework, Uncertainty-Guided Cross Attention Ensemble Mean Teacher (UG-CEMT), for achieving state-of-the-art performance in semi-supervised medical image segmentation. UG-CEMT leverages the strengths of co-training and knowledge distillation by combining a Cross-attention Ensemble Mean Teacher framework (CEMT) inspired by Vision Transformers (ViT) with uncertainty-guided consistency regularization and Sharpness-Aware Minimization emphasizing uncertainty. UG-CEMT improves semi-supervised performance while maintaining a consistent network architecture and task setting by fostering high disparity between sub-networks. Experiments demonstrate significant advantages over existing methods such as Mean Teacher and Cross-pseudo Supervision in terms of disparity, domain generalization, and medical image segmentation performance. UG-CEMT achieves state-of-the-art results on multi-center prostate MRI and cardiac MRI datasets, where object segmentation is particularly challenging. Our results show that using only 10% labeled data, UG-CEMT approaches the performance of fully supervised methods, demonstrating its effectiveness in exploiting unlabeled data for robust medical image segmentation. The code is publicly available at https://github.com/Meghnak13/UG-CEMT.
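For readers unfamiliar with the Mean Teacher baseline named in the abstract, its defining step is that the teacher's weights are an exponential moving average (EMA) of the student's weights rather than being trained by gradient descent. The sketch below shows that update on plain dictionaries of scalar weights; `ema_update`, the dictionary layout, and the value of `alpha` are illustrative assumptions, and the abstract does not state how UG-CEMT's cross-attention ensemble modifies this step.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student.

    teacher, student: dicts mapping parameter names to float values
                      (a stand-in for real tensors, for illustration only).
    alpha:            smoothing factor; closer to 1 means a slower-moving teacher.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


# One update step: the teacher moves 10% of the way toward the student.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 2.0}
teacher = ema_update(teacher, student, alpha=0.9)
```

Because the teacher averages over many student states, its predictions tend to be more stable than any single student snapshot, which is what makes it a useful target for consistency regularization on unlabeled data.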
