Reducing Annotating Load: Active Learning with Synthetic Images in Surgical Instrument Segmentation

Haonan Peng, Shan Lin, Daniel King, Yun-Hsuan Su, Randall A. Bly, Kris S. Moe, Blake Hannaford

TL;DR

The paper tackles the high annotation burden of surgical instrument segmentation in endoscopic video by marrying BALD-based active learning with synthetic copy-and-paste image generation. It generates two types of synthetic images from selected real samples and blends boundaries to reduce artifacts, thereby enriching the training set with minimal real-label effort. Across three datasets, the approach achieves substantial gains at low annotation budgets and remains competitive as budgets grow, while also exploring fusion strength and external-background strategies. The method is practical, generalizable, and openly available, offering a clear path to more efficient MIS segmentation pipelines.
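To make the BALD-based query step more concrete, here is a minimal sketch of how an image-level BALD score could be computed from several stochastic forward passes of a segmentation network. Several things here are assumptions for illustration and are not taken from the paper: the use of Monte Carlo samples (for example via dropout), the array layout `(T, N, H, W, C)`, averaging the per-pixel score over the image, and the function names `bald_scores` and `select_queries`.

```python
import numpy as np


def bald_scores(mc_probs, eps=1e-12):
    """Image-level BALD (mutual information) scores.

    mc_probs: array of shape (T, N, H, W, C) holding softmax outputs from
    T stochastic forward passes over N unlabeled images (assumed layout).
    """
    # Entropy of the mean prediction, per pixel: shape (N, H, W).
    mean_p = mc_probs.mean(axis=0)
    predictive_entropy = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)
    # Mean of the per-pass entropies, per pixel: shape (N, H, W).
    per_pass_entropy = -(mc_probs * np.log(mc_probs + eps)).sum(axis=-1)
    expected_entropy = per_pass_entropy.mean(axis=0)
    # BALD = H[E[p]] - E[H[p]]; averaging over pixels is an assumption.
    mutual_info = predictive_entropy - expected_entropy
    return mutual_info.mean(axis=(1, 2))


def select_queries(mc_probs, budget):
    """Indices of the `budget` images with the highest BALD score."""
    scores = bald_scores(mc_probs)
    return np.argsort(scores)[::-1][:budget]
```

Images on which the stochastic passes disagree score high, while images where every pass is equally uncertain score near zero, which is what separates BALD from plain entropy sampling.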

Abstract

Accurate instrument segmentation in endoscopic vision of robot-assisted surgery is challenging due to reflection on the instruments and frequent contact with tissue. Deep neural networks (DNNs) show competitive performance and have been favored in recent years. However, the hunger of DNNs for labeled data creates a heavy annotation workload. To alleviate this workload, we propose a general, embeddable method that decreases the use of labeled real images by using actively generated synthetic images. In each active learning iteration, the most informative unlabeled images are first queried by active learning and then labeled. Next, synthetic images are generated based on these selected images. The instruments and backgrounds are cropped out and randomly combined with each other, with blending and fusion near the boundary. The effectiveness of the proposed method is validated on two sinus surgery datasets and one intra-abdominal surgery dataset. The results indicate a considerable improvement in performance, especially when the annotation budget is small. The effectiveness of different types of synthetic images, blending methods, and external backgrounds is also studied. All the code is open-sourced at: https://github.com/HaonanPeng/active_syn_generator.
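The copy-and-paste step with blending near the boundary can be illustrated with a short sketch. This is not the authors' implementation (see the repository linked above for that). It assumes 8-bit images of equal size, a binary instrument mask, and a simple box-filter softening of the mask as a stand-in for an averaging-style fusion. The function names `box_blur` and `paste_instrument` and the `radius` parameter are hypothetical.

```python
import numpy as np


def box_blur(mask, radius):
    """Average a 2-D array over a (2r+1) x (2r+1) window, edge-padded."""
    mask = mask.astype(np.float64)
    if radius == 0:
        return mask
    k = 2 * radius + 1
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def paste_instrument(instrument_img, instrument_mask, background_img, radius=2):
    """Paste the masked instrument onto a background with a soft boundary.

    instrument_img, background_img: uint8 arrays of shape (H, W, 3).
    instrument_mask: binary array of shape (H, W), 1 on the instrument.
    Returns the synthetic image and its label (assumed here to be the
    unchanged instrument mask).
    """
    # The softened mask acts as a per-pixel mixing weight in [0, 1].
    alpha = box_blur(instrument_mask, radius)[..., None]
    blended = alpha * instrument_img + (1.0 - alpha) * background_img
    synthetic = np.clip(np.rint(blended), 0, 255).astype(np.uint8)
    return synthetic, instrument_mask.copy()
```

Pixels well inside the instrument keep the instrument colors, pixels far from it keep the background, and a band of width roughly `radius` around the outline mixes the two, which is the transition area the blending is meant to produce.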

Paper Structure

This paper contains 18 sections, 18 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Workflow of the system
  • Figure 2: Generation of a synthetic image. Note that $M_v$ is only for visualization: the solid green line indicates the outline of the instrument, the yellow area comes from the background image $I_B$, and the blue area comes from the instrument image $I_{IC}$. The transition area can be seen around the boundary of the instrument on the synthetic image.
  • Figure 3: Multi-blending: the images on the left and right have the same instrument size and position and the same background, but use different blending methods (average fusion on the left, Gaussian fusion on the right) and parameters.
  • Figure 4: Example images from the UW-Sinus-Surgery-C/L dataset, where the appearance of the instrument varies due to reflection.
  • Figure 5: Original real images (left), Type-1 synthetic images (center), and Type-2 synthetic images (right). Type-1 has the same instrument but a different background, and Type-2 has the same background but a different instrument.
  • ...and 5 more figures