Expanded Comprehensive Robotic Cholecystectomy Dataset (CRCD)

Ki-Hwan Oh, Leonardo Borgioli, Alberto Mangano, Valentina Valle, Marco Di Pangrazio, Francesco Toti, Gioia Pozza, Luciano Ambrosini, Alvaro Ducas, Miloš Žefran, Liaohai Chen, Pier Cristoforo Giulianotti

TL;DR

The paper presents the Expanded Comprehensive Robotic Cholecystectomy Dataset (CRCD), an ex vivo porcine-liver dataset that unifies full kinematic data, pedal signals, time-stamped stereo videos, and segmentation and keypoint annotations. It introduces two computer-vision annotation streams in COCO format and demonstrates the dataset's utility through pedal intent prediction, instance segmentation, keypoint detection, and 3D scene reconstruction, using multiple models including Detectron2 and MaskDINO. The work highlights the dataset’s capacity to support surgeon-skill analysis, context-aware assistance, and automation of robotic tasks, while acknowledging the workspace limitations of an ex vivo setup compared to in vivo procedures. By providing surgeon profiles and synchronized multi-modal signals, CRCD supports model training and evaluation aimed at advancing automation in robotic-assisted surgery.

Abstract

In recent years, the application of machine learning to minimally invasive surgery (MIS) has attracted considerable interest. Datasets are critical to the use of such techniques. This paper presents a unique dataset recorded during ex vivo pseudo-cholecystectomy procedures on pig livers using the da Vinci Research Kit (dVRK). Unlike existing datasets, it addresses a critical gap by providing comprehensive kinematic data, recordings of all pedal inputs, and a time-stamped record of the endoscope's movements. This expanded version also includes segmentation and keypoint annotations of images, enhancing its utility for computer vision applications. Seven surgeons with varied backgrounds and experience levels contributed to the dataset, and those backgrounds and experience levels are documented as part of this expanded version, making the dataset an important new resource for surgical robotics research. It enables the development of advanced methods for evaluating surgeon skills, tools for providing better context awareness, and automation of surgical tasks. Our work overcomes the limitations of incomplete recordings and imprecise kinematic data found in other datasets. To demonstrate the potential of the dataset for advancing automation in surgical robotics, we introduce two models that predict clutch usage and camera activation, a 3D scene reconstruction example, and the results from our keypoint and segmentation models.
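
As an illustration of what segmentation and keypoint annotations in COCO format look like in practice, the following is a minimal sketch of reading one annotation. It follows the general COCO convention (flat `[x, y, visibility]` keypoint triplets, `[x, y, width, height]` boxes, polygon segmentations). The file name, image size, category name, keypoint names, and all coordinates are hypothetical placeholders and are not taken from the CRCD release; the helper `keypoints_by_name` is likewise an assumption made for this example.

```python
# Hypothetical COCO-style record. Every name and number here is illustrative
# only and is NOT taken from the actual CRCD annotation files.
coco = {
    "images": [
        {"id": 1, "file_name": "frame_000001.png", "width": 640, "height": 480}
    ],
    "categories": [
        {
            "id": 1,
            "name": "tool",  # placeholder category name
            "keypoints": ["tip_left", "tip_right", "joint"],  # placeholder names
            "skeleton": [[1, 3], [2, 3]],  # 1-based keypoint index pairs
        }
    ],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100.0, 120.0, 80.0, 60.0],  # x, y, width, height
            "segmentation": [[100.0, 120.0, 180.0, 120.0,
                              180.0, 180.0, 100.0, 180.0]],  # one polygon
            # Flat triplets (x, y, v): v=0 not labeled, 1 labeled but
            # occluded, 2 labeled and visible.
            "keypoints": [110, 130, 2, 170, 135, 2, 0, 0, 0],
            "num_keypoints": 2,
            "area": 4800.0,
            "iscrowd": 0,
        }
    ],
}


def keypoints_by_name(coco, annotation):
    """Map each keypoint name of the annotation's category to (x, y, v)."""
    categories = {c["id"]: c for c in coco["categories"]}
    names = categories[annotation["category_id"]]["keypoints"]
    flat = annotation["keypoints"]
    # Regroup the flat list into consecutive (x, y, v) triplets.
    triplets = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    return dict(zip(names, triplets))


ann = coco["annotations"][0]
named = keypoints_by_name(coco, ann)
# Keypoints with v > 0 are the labeled ones counted by "num_keypoints".
labeled = [name for name, (_, _, v) in named.items() if v > 0]
print(named)
print(labeled)
```

Keeping the keypoint names on the category rather than on each annotation is part of the COCO convention, which is why the lookup goes through `categories` first.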

Paper Structure

This paper contains 22 sections, 1 equation, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Sample of the stereo endoscopic images.
  • Figure 2: A schematic of a connection between the Arduino, console pedals, and the ESU.
  • Figure 3: Setup illustrating the custom-calibrated kinematics. The arrows indicate the direction of each transformation; together, the transformations are used to find the transformation between the ECM tip and the PSM tip.
  • Figure 4: An example of generating annotations with Track-Anything. Once the initial frame of the video clip (red box) is annotated, Track-Anything starts annotating the rest of the frames.
  • Figure 5: Keypoint structure for both the FBF and the PCH tools.
  • ...and 5 more figures