Parsimonious Dataset Construction for Laparoscopic Cholecystectomy Structure Segmentation
Yuning Zhou, Henry Badgery, Matthew Read, James Bailey, Catherine Davey
TL;DR
The paper addresses the high labeling burden in laparoscopic cholecystectomy (LC) structure segmentation by introducing an active learning framework that selects informative frames from surgical videos. It evaluates uncertainty-based (prediction entropy) and diversity-based (deep feature distance) acquisition strategies for building a high-quality, cost-effective dataset, and reaches near full-data performance with roughly half of the labeled frames. Experiments on five LC videos show that diversity-based frame selection (Euclidean and cosine distances) outperforms random sampling and recovers most of the full-data IoU on critical anatomies and instruments. The approach contributes to surgical video dataset construction by reducing annotation effort while keeping segmentation performance close to that of the fully labeled dataset.
Abstract
Labeling has always been expensive in the medical context, which has hindered related deep learning applications. Our work introduces active learning for surgical video frame selection to construct a high-quality, affordable Laparoscopic Cholecystectomy dataset for semantic segmentation. Active learning brings the dataset construction workflow into the Deep Neural Network (DNN) learning pipeline: DNNs trained on the existing dataset identify the most informative samples among newly collected data. In turn, the DNNs' performance and generalization ability improve over time as the newly selected and annotated data are added to the training set. We assessed different measures of data informativeness and found that deep feature distances select the most informative data for this task. Our experiments show that, on the critical anatomies and surgical instruments, DNNs trained on half of the data selected by active learning achieve 0.4349 mean Intersection over Union (mIoU), almost the same as the same DNNs trained on the full dataset (0.4374 mIoU).
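To make the two informativeness measures concrete, the following is a minimal NumPy sketch of how a frame could be scored by prediction entropy and how frames could be picked by deep feature distance. It is an illustration under assumptions, not the authors' implementation: the function names (`prediction_entropy`, `pairwise_distance`, `select_diverse`), the per-frame averaging of pixel entropy, and the greedy farthest-point selection rule are all hypothetical choices that the text above does not specify.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Mean per-pixel Shannon entropy of a softmax map with shape (C, H, W).

    Assumption: a frame's uncertainty is the average entropy over its pixels.
    """
    p = np.clip(probs, eps, 1.0)
    return float(-(p * np.log(p)).sum(axis=0).mean())


def pairwise_distance(a, b, metric="euclidean"):
    """Distances between the rows of a (n, d) and the rows of b (m, d)."""
    if metric == "cosine":
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return 1.0 - a @ b.T
    diff = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(diff, axis=2)


def select_diverse(pool_feats, labeled_feats, budget, metric="euclidean"):
    """Greedy farthest-point selection (an assumed selection rule).

    Repeatedly picks the unlabeled frame whose feature vector is farthest
    from its nearest labeled or already-selected frame.
    """
    min_dist = pairwise_distance(pool_feats, labeled_feats, metric).min(axis=1)
    chosen = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))
        chosen.append(idx)
        # Update each frame's distance to its nearest selected frame.
        new_dist = pairwise_distance(pool_feats, pool_feats[idx:idx + 1], metric)[:, 0]
        min_dist = np.minimum(min_dist, new_dist)
        min_dist[idx] = -np.inf  # never pick the same frame twice
    return chosen


# Toy example: four unlabeled frames described by 2-D features (made-up values).
pool = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 0.0]])
labeled = np.array([[0.0, 0.0]])
picked = select_diverse(pool, labeled, budget=2)
```

In this toy setting the selection skips the frames that nearly duplicate the labeled one and spends the budget on the two distant frames, which is the intuition behind diversity-based acquisition. In practice the feature vectors would come from an intermediate layer of the segmentation network, and the choice of layer is another assumption left open here.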
