DualVision ArthroNav: Investigating Opportunities to Enhance Localization and Reconstruction in Image-based Arthroscopy Navigation via External Cameras
Hongchao Shu, Lalithkumar Seenivasan, Mingxu Liu, Yunseo Hwang, Yu-Chun Ku, Jonathan Knopf, Alejandro Martin-Gomez, Mehran Armand, Mathias Unberath
TL;DR
Monocular arthroscopy suffers from scale ambiguity and drift, while optical trackers constrain the workspace and disrupt surgical workflow. The authors introduce DualVision ArthroNav, a fully vision-based intraoperative navigation system that pairs the arthroscope with an external camera rigidly mounted on it: the external camera provides absolute localization via visual odometry, and the monocular arthroscope video provides dense intra-articular reconstruction. A local-to-global registration step recovers scale and aligns local reconstructions with a global model, aided by a multi-camera calibration and hand-eye alignment strategy; the pipeline uses ORB-SLAM for external localization and 3D Gaussian Splatting for dense meshes. Quantitative results show mean global tracking errors of $ATE\approx 1.08$ mm and $RTE\approx 1.09$ mm, a target registration error of $TRE\approx 2.16$ mm for the reconstructed surfaces, and rendering fidelity of PSNR ≈ 22.19 and SSIM ≈ 0.69, supporting practical, low-cost, vision-based guidance.
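To make the scale-recovery idea concrete, here is a minimal sketch of a similarity alignment between a scale-ambiguous point set and a metric one. It uses the classical Umeyama closed-form solution, which is an assumption chosen for illustration; the summary does not state which solver the authors' local-to-global registration uses. The function names `umeyama_alignment` and `absolute_trajectory_error` are hypothetical, and the error is computed as a mean Euclidean distance, which may differ from the paper's exact ATE definition.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Estimate scale s, rotation R, translation t minimizing
    sum ||dst_i - (s * R @ src_i + t)||^2 (Umeyama closed form).
    src, dst: (N, 3) arrays of corresponding points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Reflection guard: force a proper rotation (det(R) = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

def absolute_trajectory_error(est, gt):
    """Mean Euclidean distance between corresponding positions
    (illustrative definition; same units as the inputs)."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(est - gt, axis=1).mean())
```

In use, `src` would hold positions from the up-to-scale monocular reconstruction and `dst` the corresponding metric positions from the external camera; applying `s * src @ R.T + t` brings the local result into the global, metric frame.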
Abstract
Arthroscopic procedures can greatly benefit from navigation systems that enhance spatial awareness, depth perception, and field of view. However, existing optical tracking solutions impose strict workspace constraints and disrupt surgical workflow. Vision-based alternatives, though less invasive, often rely solely on the monocular arthroscope camera, making them prone to drift, scale ambiguity, and sensitivity to rapid motion or occlusion. We propose DualVision ArthroNav, a multi-camera arthroscopy navigation system that integrates an external camera rigidly mounted on the arthroscope. The external camera provides stable visual odometry and absolute localization, while the monocular arthroscope video enables dense scene reconstruction. By combining these complementary views, our system resolves the scale ambiguity and long-term drift inherent in monocular SLAM and ensures robust relocalization. Experiments demonstrate that our system effectively compensates for calibration errors, achieving an average absolute trajectory error of 1.09 mm. The reconstructed scenes reach an average target registration error of 2.16 mm, with high visual fidelity (SSIM = 0.69, PSNR = 22.19). These results indicate that our system provides a practical and cost-efficient solution for arthroscopic navigation, bridging the gap between optical tracking and purely vision-based systems, and paving the way toward clinically deployable, fully vision-based arthroscopic guidance.
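Because the external camera is rigidly mounted on the arthroscope, a fixed hand-eye transform relates the two camera frames, and the arthroscope pose follows from the externally tracked pose by a single matrix product. The sketch below illustrates that geometry only; the names `make_pose`, `rot_z`, and `arthroscope_pose`, the frame conventions, and the millimetre units are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(deg):
    """Rotation about the z axis by `deg` degrees."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def arthroscope_pose(T_world_ext, T_ext_scope):
    """Chain the tracked external-camera pose (world <- external) with the
    fixed hand-eye transform (external <- arthroscope)."""
    return T_world_ext @ T_ext_scope

# Example with made-up numbers: the external camera is rotated 90 degrees
# about z and sits at x = 10 mm; the arthroscope camera is offset 100 mm
# along the external camera's x axis.
T_world_ext = make_pose(rot_z(90), [10.0, 0.0, 0.0])
T_ext_scope = make_pose(np.eye(3), [100.0, 0.0, 0.0])
T_world_scope = arthroscope_pose(T_world_ext, T_ext_scope)
```

Any error in the hand-eye transform propagates directly into the arthroscope pose through this product, which is why compensating for calibration error matters for the reported trajectory accuracy.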
