SurgTrack: CAD-Free 3D Tracking of Real-world Surgical Instruments
Wenwu Guo, Jinlin Wu, Zhen Chen, Qingxiang Zhao, Miao Xu, Zhen Lei, Hongbin Liu
TL;DR
SurgTrack tackles 3D surgical instrument tracking without CAD models. It introduces an Instrument Signed Distance Field (SDF) to perform CAD-free registration from RGB-D data, followed by a tracking stage that fuses current observations with historical poses via a posture memory pool and a posture graph to improve robustness under occlusion and weak texture. The method jointly optimizes depth, shape, and feature congruence through losses defined on the SDF, 3D correspondences, and 2D projections, beginning with a RANSAC-based rough pose that is then refined by graph-based optimization. The authors evaluate on HO3D and on Instrument3D, a dataset they introduce, and report state-of-the-art results on the ADD-S, ADD, and CD metrics, with ablations confirming the importance of occlusion/texture handling and memory-graph fusion. The work advances vision-based surgical navigation by enabling accurate, CAD-free 3D instrument tracking in realistic, partially occluded scenes, and provides code and data to support further research.
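For readers unfamiliar with the ADD and ADD-S metrics named above, the following is a minimal sketch of their standard definitions as commonly used in 6D pose evaluation. It is not taken from the SurgTrack code: the function names, the toy point set, and the poses are all hypothetical, and the paper's exact protocol (for example thresholds or area-under-curve reporting) is not shown here. Plain Python is used so the example has no dependencies.

```python
import math

def transform(points, R, t):
    """Apply the rigid pose (R, t) to each 3D point: R @ p + t."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)

def add_s_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its closest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)

# Hypothetical toy model: four points that are symmetric under a 180-degree turn about z.
model = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_180 = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
t0 = (0.0, 0.0, 0.5)

add_err = add_metric(model, identity, t0, rot_z_180, t0)
add_s_err = add_s_metric(model, identity, t0, rot_z_180, t0)
print(add_err, add_s_err)
```

In this toy case the predicted pose differs from the ground truth by a rotation that maps the shape onto itself, so ADD is large (every point lands on its opposite) while ADD-S is zero. This is why ADD-S is usually reported alongside ADD for symmetric objects.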
Abstract
Vision-based surgical navigation has received increasing attention because it is non-invasive, cost-effective, and flexible. A critical component of a vision-based navigation system is tracking surgical instruments. Compared with 2D instrument tracking, 3D instrument tracking has broader value in clinical practice, but it is also more challenging because of weak texture, occlusion, and the lack of Computer-Aided Design (CAD) models for 3D registration. To address these challenges, we propose SurgTrack, a two-stage 3D instrument tracking method for CAD-free and robust real-world applications. In the first stage, registration, we use an Instrument Signed Distance Field (SDF) to model the 3D representation of instruments, achieving CAD-free 3D registration. We can then obtain the location and orientation of instruments in 3D space by matching the video stream against the registered SDF model. In the second stage, tracking, we devise a posture graph optimization module that leverages the historical tracking results stored in the posture memory pool to refine the current tracking results and improve robustness to occlusion. Furthermore, we collect the Instrument3D dataset to comprehensively evaluate 3D tracking of surgical instruments. Extensive experiments validate the superiority and scalability of SurgTrack, which outperforms state-of-the-art methods by a remarkable margin. The code and dataset are available at https://github.com/wenwucode/SurgTrack.
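The idea of matching observations against a registered SDF can be illustrated with a small sketch. It assumes an analytic box SDF as a stand-in for the learned Instrument SDF, and it scores a candidate pose by how close the observed 3D points lie to the zero level set once mapped into the object frame. All names (`box_sdf`, `sdf_residual`), the box dimensions, and the poses are hypothetical. This is not the authors' implementation, which additionally uses feature and projection terms.

```python
import math

def box_sdf(p, half):
    """Signed distance from p to an axis-aligned box centred at the origin.
    Negative inside, zero on the surface, positive outside."""
    q = [abs(p[i]) - half[i] for i in range(3)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q), 0.0)
    return outside + inside

def to_object_frame(p, R, t):
    """Map a camera-frame point into the object frame: R^T (p - t)."""
    d = [p[i] - t[i] for i in range(3)]
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]

def sdf_residual(points_cam, R, t, half):
    """Mean |SDF| of observed points under candidate pose (R, t); 0 is a perfect fit."""
    total = sum(abs(box_sdf(to_object_frame(p, R, t), half)) for p in points_cam)
    return total / len(points_cam)

# Hypothetical stand-in shape and surface samples (object frame).
half = (1.0, 0.5, 0.25)
surface = [
    (1.0, 0.0, 0.0), (-1.0, 0.2, 0.1), (0.0, 0.5, 0.0),
    (0.3, -0.5, 0.1), (0.0, 0.0, 0.25), (0.5, 0.2, -0.25),
]

# Simulate a depth observation: the object sits 2 units in front of the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = (0.0, 0.0, 2.0)
observed = [tuple(p[i] + t_true[i] for i in range(3)) for p in surface]

res_true = sdf_residual(observed, identity, t_true, half)
res_off = sdf_residual(observed, identity, (0.0, 0.0, 2.1), half)
print(res_true, res_off)
```

The residual is essentially zero at the true pose and grows when the candidate translation is offset, so a pose estimator can minimize it. In practice such a term would be combined with other cues, since a purely geometric residual is ambiguous for symmetric or thin shapes.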
