Robust Surgical Tool Tracking with Pixel-based Probabilities for Projected Geometric Primitives
Christopher D'Ambrosia, Florian Richter, Zih-Yun Chiu, Nikhil Shinde, Fei Liu, Henrik I. Christensen, Michael C. Yip
TL;DR
The paper addresses the challenge of robust surgical tool localization under uncertain camera-to-base transforms by modeling a lumped error $\mathbf{E} \in SE(3)$ within the forward-kinematics framework $^{c}\mathbf{T}_j = \mathbf{E} \prod^{j}_{i=1} {^{i-1}\mathbf{T}_{i}}(\tilde{q}_{i})$ and leveraging image-based insertion-shaft observations. It introduces a deep-learning–assisted detection of the insertion-shaft via SOLD2 and compares four observation models—two operating in polar line space and two pixel-based—to update a Particle Filter estimating tool pose. Experiments on structured and deformable-tissue datasets show that pixel-based line/endpoint observations achieve lower 2D localization errors than a Canny baseline, demonstrating improved robustness in challenging lighting and occlusion conditions. The approach advances real-time, probabilistic tool tracking for autonomous surgical tasks by effectively leveraging the insertion-shaft as a geometric primitive and combining it with learned detection and Bayesian inference. The main limitation noted is the current low inference speed of SOLD2 (1–2 fps), suggesting future work on runtime optimization for real-time clinical deployment.
Abstract
Controlling robotic manipulators via visual feedback requires a known coordinate frame transformation between the robot and the camera. Uncertainties in mechanical systems as well as camera calibration create errors in this coordinate frame transformation. These errors result in poor localization of robotic manipulators and create a significant challenge for applications that rely on precise interactions between manipulators and the environment. In this work, we estimate the camera-to-base transform and joint angle measurement errors for surgical robotic tools using an image based insertion-shaft detection algorithm and probabilistic models. We apply our proposed approach in both a structured environment as well as an unstructured environment and measure to demonstrate the efficacy of our methods.
