Hybrid Deep Reinforcement Learning for Radio Tracer Localisation in Robotic-assisted Radioguided Surgery

Hanyi Zhang, Kaizhong Deng, Zhaoyang Jacopo Hu, Baoru Huang, Daniel S. Elson

TL;DR

The paper addresses the challenge of autonomous radiotracer localization in robot-assisted radioguided surgery, where traditional gamma-probe guidance depends heavily on operator expertise. It introduces a two-phase hybrid method that first uses adaptive grid-based scanning to provide directional priors and then employs a PPO-based DRL agent, augmented by an Angle Prediction Module and a Corrected Solid Angle Model, to precisely navigate to the radiotracer. Key contributions include the Phase I adaptive scanning protocol with a 5x5 grid and progressively narrowed search, the Phase II DRL framework with a multi-component reward and angle estimation, and extensive validation showing improved accuracy, efficiency, and robustness in simulation (3,200 runs) and real-world tests on the dVRK (80% success). The work has practical implications for reducing operator dependency and increasing procedural consistency in radioguided surgeries, paving the way for broader adoption of automated radiotracer localization.

Abstract

Radioguided surgery, such as sentinel lymph node biopsy, relies on the precise localization of radioactive targets by non-imaging gamma/beta detectors. Manual detection of the radioactive target, based on a visual display or an audible indication of the gamma level, depends heavily on the surgeon's ability to track and interpret the spatial information. This paper presents a learning-based method for autonomous radiotracer detection in robot-assisted surgery that navigates the probe to the radioactive target. We propose a novel hybrid approach that combines deep reinforcement learning (DRL) with adaptive robotic scanning. The adaptive grid-based scanning provides an initial direction estimate, while the DRL-based agent navigates efficiently to the target using historical data. Simulation experiments demonstrate a 95% success rate, together with improved efficiency and robustness compared to conventional techniques. Real-world evaluation on the da Vinci Research Kit (dVRK) further confirms the feasibility of the approach, achieving an 80% success rate in radiotracer detection. This method has the potential to enhance consistency, reduce operator dependency, and improve procedural accuracy in radioguided surgeries.
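To make the adaptive grid-based scanning idea concrete, the following is a minimal sketch of a 5x5 grid scan whose search window is progressively narrowed around the strongest reading. It is not the authors' implementation: the function names, the shrink factor, the number of rounds, and the inverse-square `toy_cps` signal (a stand-in, not the paper's Corrected Solid Angle Model) are all assumptions made for illustration.

```python
from typing import Callable, Tuple


def adaptive_grid_scan(
    read_cps: Callable[[float, float], float],
    center: Tuple[float, float],
    half_width: float,
    grid_size: int = 5,    # 5x5 grid, as stated in the TL;DR
    shrink: float = 0.5,   # assumed narrowing factor per round
    rounds: int = 3,       # assumed number of narrowing rounds
) -> Tuple[float, float]:
    """Return the grid point with the highest reading after narrowing."""
    cx, cy = center
    for _ in range(rounds):
        step = 2.0 * half_width / (grid_size - 1)
        best = None
        for i in range(grid_size):
            for j in range(grid_size):
                x = cx - half_width + i * step
                y = cy - half_width + j * step
                cps = read_cps(x, y)
                if best is None or cps > best[0]:
                    best = (cps, x, y)
        # Re-centre on the strongest reading and narrow the window.
        _, cx, cy = best
        half_width *= shrink
    return cx, cy


def toy_cps(x: float, y: float) -> float:
    """Hypothetical noise-free signal: inverse-square falloff from a point source."""
    source_x, source_y, height = 0.012, -0.007, 0.02  # metres, made-up values
    d2 = (x - source_x) ** 2 + (y - source_y) ** 2 + height ** 2
    return 1.0 / d2


estimate = adaptive_grid_scan(toy_cps, center=(0.0, 0.0), half_width=0.04)
```

In this toy setting the vector from the starting centre to `estimate` could serve as the kind of coarse directional prior that a learned agent then refines; with noisy counts, each grid point would likely need averaging over several readings.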

Paper Structure

This paper contains 14 sections, 7 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Process of radiotracer detection during robot-assisted surgery. The ProGrasp forceps hold the drop-in gamma probe, which is used to detect radioactive lymph nodes. As the probe moves closer to the target, the increase in real-time CPS feedback indicates the probe's proximity to the radiotracer. The precise manipulation of the probe thereby aids the localization of sentinel lymph nodes during procedures such as SLNB.
  • Figure 2: Overview of the proposed DRL-based hybrid approach for radiotracer detection. The method is divided into two phases: Phase I uses adaptive scanning to systematically detect the direction of the target and narrow the detection range, while Phase II employs DRL for precise localization of the radiotracer. Phase I supplies the initial data from which DRL estimates a rough target position for its state space at the start of an episode, as well as direction guidance that reduces exploration in DRL. The performance comparison shows that the hybrid method combines the accuracy and robustness of classical scanning with the efficiency of DRL.
  • Figure 3: The simulation environment consists of a Robot Platform and a Radioactivity Model. The Robot Platform (left) uses Surgical Gym to simulate the da Vinci PSM with PD controllers for joint positions, enabling efficient training through GPU-accelerated simulation. The Radioactivity Model (right) uses the Corrected Solid Angle Model to simulate the gamma probe’s response and is fitted to experimental data.
  • Figure 4: Comparison of Actual and Predicted CPS: the predicted CPS values from the model closely match the actual CPS values collected during the experiments, demonstrating the accuracy of the model.
  • Figure 5: Comparison of three different reward settings on success rate (a) and episode length (b). The distance-based reward shows a lower success rate, longer episodes, and slower learning, while the signal-based reward effectively guides the probe near the target but lacks stability. The composite approach achieves a balance between learning speed, stability, and overall performance.
  • ...and 1 more figure
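The three reward settings compared in Figure 5 (distance-based, signal-based, composite) can be illustrated with a small sketch. The exact terms and weights used in the paper are not given in this summary, so everything below is a hypothetical formulation: the function name, the weights `w_dist` and `w_sig`, the step penalty, and the success bonus are assumptions chosen only to show how a distance term and a signal term might be combined.

```python
def composite_reward(
    prev_dist: float,
    dist: float,
    prev_cps: float,
    cps: float,
    reached: bool,
    w_dist: float = 1.0,        # assumed weight of the distance term
    w_sig: float = 1.0,         # assumed weight of the signal term
    step_penalty: float = 0.01, # assumed per-step cost to favour short episodes
    bonus: float = 10.0,        # assumed terminal bonus on success
) -> float:
    """Hypothetical composite reward mixing distance and signal progress."""
    # Distance-based term: positive when the probe moved closer to the target.
    distance_term = prev_dist - dist
    # Signal-based term: relative change in counts per second (CPS).
    signal_term = (cps - prev_cps) / max(prev_cps, 1e-9)
    reward = w_dist * distance_term + w_sig * signal_term - step_penalty
    if reached:
        reward += bonus
    return reward


# A step toward the target with rising CPS versus a step away with falling CPS.
r_toward = composite_reward(0.05, 0.04, 100.0, 120.0, reached=False)
r_away = composite_reward(0.05, 0.06, 100.0, 90.0, reached=False)
```

Setting `w_sig=0` or `w_dist=0` recovers a purely distance-based or purely signal-based variant of this sketch, which is one way such an ablation could be organised.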