Nested ResNet: A Vision-Based Method for Detecting the Sensing Area of a Drop-in Gamma Probe

Songyu Xu, Yicheng Hu, Jionglong Su, Daniel Elson, Baoru Huang

TL;DR

This paper introduces a three-branch deep learning framework that predicts the sensing area of a drop-in gamma probe from stereo laparoscopic images using a Nested ResNet architecture, giving surgeons more accurate and reliable probe localisation during surgery.

Abstract

Purpose: Drop-in gamma probes are widely used in robotic-assisted minimally invasive surgery (RAMIS) for lymph node detection. However, these devices provide only audio feedback on signal intensity and lack the visual feedback necessary for precise localisation. Previous work attempted to predict the sensing area location from laparoscopic images, but the prediction accuracy was unsatisfactory, so the deep learning-based regression approach needs improvement. Methods: We introduce a three-branch deep learning framework to predict the sensing area of the probe. Specifically, we use the stereo laparoscopic images as input for the main branch and develop a Nested ResNet architecture. The framework also incorporates depth estimation via transfer learning and orientation guidance through probe axis sampling. Combining the features from the three branches improves prediction accuracy. Results: Our approach was evaluated on a publicly available dataset and outperformed previous methods. In particular, it achieved a 22.10% decrease in 2D mean error and a 41.67% reduction in 3D mean error. Qualitative comparisons further demonstrated the improved precision of our approach. Conclusion: With extensive evaluation, our solution significantly enhances the accuracy and reliability of sensing area predictions. This advancement enables visual feedback during the use of the drop-in gamma probe in surgery, providing surgeons with more accurate and reliable localisation.

Paper Structure

This paper contains 15 sections, 4 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: The SENSEI® gamma probe working scenario and the definition of the sensing area
  • Figure 2: The architecture for sensing area detection: stereo laparoscopic images are input on the left; the main branch (b) uses a Nested ResNet to extract features from the RGB images; branch (a) extracts depth features from depth maps using CNN layers; branch (c) extracts features from axis points using an MLP. Features from these three branches are concatenated to predict the sensing location.
  • Figure 3: The Nested ResNet design: (a) The overall architecture. (b-c) The standard bottleneck block (SBN) and the expanded bottleneck block (EBN) that form the residual encoding part in the Nested ResNet. The SBN keeps the same channel number $C$ while the EBN expands the channel number from $C$ to $4C$.
  • Figure 4: The Coffbea dataset [huang2023detecting], which contains three types of data: (a) RGB laparoscopic images, (b) ground-truth sensing areas given as 2D coordinates, and (c) pre-generated depth maps.
  • Figure 5: Visualisation. Red dots mark the ground-truth location and green dots mark the predicted location. The top row shows results from the previous method, SL Regress [huang2023detecting]; the bottom row shows results from our method.
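
The captions for Figures 2 and 3 describe two structural ideas: bottleneck blocks that either keep the channel count at $C$ (SBN) or expand it from $C$ to $4C$ (EBN), and a fusion step that concatenates features from the three branches before predicting the sensing location. The following is a minimal, shapes-only sketch of those two ideas in plain Python. It is not the authors' implementation: the names `Bottleneck`, `trace_channels`, `fuse` and `linear_head`, the block ordering, the concatenation order, the single linear prediction head, and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Bottleneck:
    """Channel bookkeeping for one residual block (no weights, shapes only)."""
    in_channels: int
    expand: bool  # False -> SBN (C -> C), True -> EBN (C -> 4C)

    @property
    def out_channels(self) -> int:
        return 4 * self.in_channels if self.expand else self.in_channels


def trace_channels(c_in: int, expand_flags: Sequence[bool]) -> List[int]:
    """Return the channel count after each block in a stack of SBN/EBN blocks."""
    channels = []
    c = c_in
    for flag in expand_flags:
        c = Bottleneck(c, flag).out_channels
        channels.append(c)
    return channels


def fuse(depth_feat: Sequence[float],
         rgb_feat: Sequence[float],
         axis_feat: Sequence[float]) -> List[float]:
    """Concatenate per-branch feature vectors.

    The order (a) depth, (b) RGB, (c) axis is an assumption; the caption
    only states that the three feature sets are concatenated.
    """
    return list(depth_feat) + list(rgb_feat) + list(axis_feat)


def linear_head(features: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> List[float]:
    """Hypothetical prediction head: one linear layer mapping the fused
    vector to a 2D location (one weight row per output coordinate)."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


if __name__ == "__main__":
    # One EBN followed by two SBNs, starting from a hypothetical C = 64.
    print(trace_channels(64, [True, False, False]))

    # Toy feature vectors standing in for the three branch outputs.
    fused = fuse([1.0], [2.0, 3.0], [4.0])
    print(linear_head(fused, [[1, 0, 0, 0], [0, 0, 0, 1]], [0.5, -0.5]))
```

In a real model each branch would produce learned feature tensors and the head would be trained by regression against the ground-truth 2D coordinates; the sketch only makes the channel arithmetic and the concatenation step concrete.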