EndoPerfect: High-Accuracy Monocular Depth Estimation and 3D Reconstruction for Endoscopic Surgery via NeRF-Stereo Fusion
Pengcheng Chen, Wenhao Li, Nicole Gunderson, Jeremy Ruthberg, Randall Bly, Zhenglong Sun, Waleed M. Abuzeid, Eric J. Seibel
TL;DR
EndoPerfect addresses the need for fast, radiation-free, submillimeter monocular depth estimation in endoscopic sinus surgery (ESS). It introduces an iterative pipeline that uses NeRF as an intermediate representation, generates optimized novel stereo views, and applies depth-supervised refinement to produce dense 3D reconstructions. The method achieves point-to-point accuracy below $0.5$ mm and a depth accuracy of $0.125 \pm 0.443$ mm, is validated on synthetic, phantom, cadaver, and intraoperative data, and runs faster than intraoperative CT (iCT) for 100-frame sequences. Key contributions include a Nerfacto-based NeRF initialization, a gradient-driven novel view optimization that preserves epipolar geometry, and a DS-NeRF-inspired depth supervision loop with progressive baseline growth and geometric fusion. The results suggest that EndoPerfect can serve as a practical iCT replacement in ESS, offering submillimeter accuracy with reduced radiation exposure and improved intraoperative efficiency; future work targets generalization to larger spaces and near-real-time processing.
Abstract
In endoscopic sinus surgery (ESS), intraoperative CT (iCT) provides valuable assessment during the procedure, but slow deployment and radiation exposure limit its clinical utility. Endoscope-based monocular 3D reconstruction is a promising alternative; however, existing techniques often struggle to achieve the submillimeter precision required for dense reconstruction. In this work, we propose an iterative online learning approach that leverages Neural Radiance Fields (NeRF) as an intermediate representation, enabling monocular depth estimation and 3D reconstruction without relying on prior medical data. Our method attains a point-to-point accuracy below $0.5$ mm, with a demonstrated theoretical depth accuracy of $0.125 \pm 0.443$ mm. We validate our approach across synthetic, phantom, and real endoscopic scenarios, confirming its accuracy and reliability. These results underscore the potential of our pipeline as an iCT alternative that meets the demanding submillimeter accuracy standards required in ESS.
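To illustrate why a stereo pair rendered from a NeRF can yield metric depth, and why a wider baseline tends to help, the following minimal sketch applies the standard rectified-stereo relation $Z = fB/d$ and its first-order error $\delta Z \approx \frac{Z^2}{fB}\,\delta d$. It is not the authors' implementation: the function names and all numeric values (focal length, baseline, disparity error) are hypothetical and chosen only for illustration.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Rectified-stereo triangulation: Z = f * B / d, returned in mm.

    Non-positive disparities carry no depth information and map to inf.
    """
    d = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(d > 0, focal_px * baseline_mm / d, np.inf)

def depth_error_mm(depth_mm, focal_px, baseline_mm, disparity_err_px):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    z = np.asarray(depth_mm, dtype=float)
    return z ** 2 / (focal_px * baseline_mm) * disparity_err_px

# Hypothetical values, not taken from the paper.
focal_px = 500.0        # focal length in pixels
disparity_err = 0.25    # assumed disparity matching error in pixels

z = depth_from_disparity(np.array([50.0]), focal_px, baseline_mm=2.0)
print(z)  # [20.] -> a 50 px disparity corresponds to 20 mm depth

# Doubling the baseline halves the first-order depth error at a fixed depth.
for baseline_mm in (2.0, 4.0):
    print(baseline_mm, depth_error_mm(20.0, focal_px, baseline_mm, disparity_err))
```

Under these assumed numbers the depth error at 20 mm drops from 0.1 mm to 0.05 mm when the baseline grows from 2 mm to 4 mm, which is one plausible motivation for growing the baseline between synthesized views progressively.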
