OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery

Long Bai, Guankun Wang, Jie Wang, Xiaoxiao Yang, Huxin Gao, Xin Liang, An Wang, Mobarakol Islam, Hongliang Ren

TL;DR

This work tackles open-set surgical activity recognition (SAR) in robot-assisted surgery. It introduces OSSAR, a framework that uses HyperSpherical Reciprocal Points (HSRP) to separate known and unknown classes on a hypersphere, coupled with Closed-set Over-confidence Calibration (COC) to reduce misclassification of unknown classes as known ones. The approach is evaluated on two benchmarks, an open-set benchmark built on the public JIGSAWS dataset and the newly collected DREAMS endoscopic dataset, where it outperforms state-of-the-art open-set recognition (OSR) methods in both known-class accuracy and unknown detection. The results indicate improved robustness to unseen surgical activities, with practical implications for safer and more reliable intelligent surgical systems. The code is publicly available, supporting reproducibility and further research in open-set SAR.

Abstract

In the realm of automated robotic surgery and computer-assisted interventions, understanding robotic surgical activities stands paramount. Existing algorithms dedicated to surgical activity recognition predominantly cater to pre-defined closed-set paradigms, ignoring the challenges of real-world open-set scenarios. Such algorithms often falter in the presence of test samples originating from classes unseen during training phases. To tackle this problem, we introduce an innovative Open-Set Surgical Activity Recognition (OSSAR) framework. Our solution leverages the hyperspherical reciprocal point strategy to enhance the distinction between known and unknown classes in the feature space. Additionally, we address the issue of over-confidence in the closed set by refining model calibration, avoiding misclassification of unknown classes as known ones. To support our assertions, we establish an open-set surgical activity benchmark utilizing the public JIGSAWS dataset. Besides, we also collect a novel dataset on endoscopic submucosal dissection for surgical activity tasks. Extensive comparisons and ablation experiments on these datasets demonstrate the significant outperformance of our method over existing state-of-the-art approaches. Our proposed solution can effectively address the challenges of real-world surgical scenarios. Our code is publicly accessible at https://github.com/longbai1006/OSSAR.

Paper Structure

This paper contains 21 sections, 9 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Illustration of open-set surgical activity recognition. The gray band depicts the unknown classes, whereas the other bands represent the known classes. The model can predict the pre-defined classes, but it cannot recognize a class during testing if that class was not defined during training.
  • Figure 2: Overview of our OSSAR framework. (a) The general procedure of the OSSAR framework, with the HSRP classification loss $\mathcal{L}_{hc}$, the adversarial margin constraint loss $\mathcal{L}_{amc}$, and the closed-set over-confidence calibration loss $\mathcal{L}_{coc}$. (b) The principle of reciprocal points: the opposite point of each target class represents the feature space of all known and unknown classes except the target class. (c) The disordered state of the hyperspherical feature space before classification. (d) The ordinary closed-set classification scenario, where samples of each class are clustered together. (e) Our hyperspherical reciprocal point solution, which pushes the unknown classes closer to the reciprocal points, resulting in better unknown-class detection.
  • Figure 3: Color-coded ribbon illustration for the DREAMS dataset. The gray band depicts the unknown classes, whereas the other bands denote the known classes. Red indicates the unknown samples that the model fails to recognize correctly. The six classes, in order, are marking, injection, circumferential incision, subsidized injection, installation and debugging, and bimanual submucosal dissection.
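
The reciprocal-point principle described in Figure 2 (b) and (e) can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine distance, the softmax temperature, the rejection threshold, and all function names are assumptions made for illustration, and the three training losses ($\mathcal{L}_{hc}$, $\mathcal{L}_{amc}$, $\mathcal{L}_{coc}$) are not reproduced. The idea shown is only that a feature far from the reciprocal point of class k is likely to belong to class k, while a feature close to every reciprocal point is treated as unknown.

```python
import math


def l2_normalize(vector):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_distance(a, b):
    """Distance on the hypersphere: 0 for identical directions, 2 for opposite ones."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def class_probabilities(feature, reciprocal_points, temperature=0.1):
    """Softmax over distances to the reciprocal points.

    The reciprocal point of class k stands for "everything except class k",
    so a LARGER distance to it yields a HIGHER probability for class k.
    The temperature value is an illustrative assumption.
    """
    logits = [cosine_distance(feature, p) / temperature for p in reciprocal_points]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(feature, reciprocal_points, threshold=0.9, temperature=0.1):
    """Return the predicted known-class index, or -1 for "unknown".

    The threshold is a hypothetical hyperparameter, not a value from the paper.
    """
    probs = class_probabilities(feature, reciprocal_points, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1


if __name__ == "__main__":
    # Toy 2-D setup: class 0 lives opposite (-1, 0), class 1 opposite (0, -1).
    toy_reciprocal_points = [[-1.0, 0.0], [0.0, -1.0]]
    print(predict_open_set([1.0, 0.1], toy_reciprocal_points))    # known class 0
    print(predict_open_set([-1.0, -1.0], toy_reciprocal_points))  # rejected as unknown
```

In this toy setup, a feature lying between the two reciprocal points is equally close to both, so the softmax is flat and the sample falls below the threshold. This mirrors the behavior described in Figure 2 (e), where unknown classes are pushed toward the reciprocal points.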