Endo-SemiS: Towards Robust Semi-Supervised Image Segmentation for Endoscopic Video
Hao Li, Daiwei Lu, Xing Yao, Nicholas Kavoussi, Ipek Oguz
TL;DR
This work tackles robust endoscopic image segmentation under limited annotations. It introduces Endo-SemiS, a semi-supervised framework that combines cross-supervision of two U-Nets with uncertainty-guided pseudo-labeling, joint pseudo-label supervision, and multi-level mutual learning, plus a spatiotemporal correction module to exploit video context. The method dynamically filters unreliable regions using aleatoric and epistemic uncertainty, and fuses predictions to produce reliable supervision for unlabeled frames. Experiments on kidney stone lithotripsy and colon polyp screening show Endo-SemiS achieves state-of-the-art performance, often surpassing fully supervised baselines with much less labeled data and demonstrating cross-site generalization. The work offers a practical, real-time-capable approach for robust endoscopic segmentation in resource-constrained settings.
Abstract
In this paper, we present Endo-SemiS, a semi-supervised segmentation framework that provides reliable segmentation of endoscopic video frames with limited annotation. Endo-SemiS uses four strategies to improve performance by effectively utilizing all available data, particularly unlabeled data: (1) cross-supervision between two individual networks that supervise each other; (2) uncertainty-guided pseudo-labels from unlabeled data, which are generated by selecting high-confidence regions to improve their quality; (3) joint pseudo-label supervision, which aggregates reliable pixels from the pseudo-labels of both networks to provide accurate supervision for unlabeled data; and (4) mutual learning, where both networks learn from each other at the feature and image levels, reducing variance and guiding them toward a consistent solution. Additionally, a separate corrective network utilizes spatiotemporal information from endoscopy video to improve segmentation performance. Endo-SemiS is evaluated on two clinical applications: kidney stone laser lithotomy from ureteroscopy and polyp screening from colonoscopy. Compared to state-of-the-art segmentation methods, Endo-SemiS achieves substantially superior results on both datasets with limited labeled data. The code is publicly available at https://github.com/MedICL-VU/Endo-SemiS
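
To make strategies (2) and (3) more concrete, the following is a minimal NumPy sketch of how two networks' predictions for an unlabeled frame might be filtered by uncertainty and fused into one joint pseudo-label. It is an illustration under stated assumptions, not the authors' implementation: predictive entropy stands in for the uncertainty estimate, and the names entropy_map, joint_pseudo_label, max_entropy, and IGNORE are hypothetical.

```python
import numpy as np

IGNORE = 255  # hypothetical label for pixels excluded from the unlabeled loss


def entropy_map(probs, eps=1e-8):
    """Per-pixel predictive entropy for softmax probabilities of shape (C, H, W)."""
    return -np.sum(probs * np.log(probs + eps), axis=0)


def joint_pseudo_label(probs_a, probs_b, max_entropy=0.3):
    """Fuse two networks' predictions into a single pseudo-label map.

    Assumption: entropy is used as the uncertainty measure and
    max_entropy is an illustrative threshold, not a value from the paper.
    """
    ent_a = entropy_map(probs_a)
    ent_b = entropy_map(probs_b)
    # Per pixel, take the class predicted by the less uncertain network.
    use_a = ent_a <= ent_b
    label = np.where(use_a, probs_a.argmax(axis=0), probs_b.argmax(axis=0))
    # Keep a pixel only if the chosen network is confident enough;
    # everything else is marked IGNORE so it contributes no supervision.
    reliable = np.minimum(ent_a, ent_b) < max_entropy
    return np.where(reliable, label, IGNORE)
```

In this sketch, pixels where neither network is confident are simply dropped from supervision, which is one simple way to keep unreliable regions from reinforcing errors during cross-supervision.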
