Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping
Yiming Huang, Beilei Cui, Long Bai, Zhen Chen, Jinlin Wu, Zhen Li, Hongbin Liu, Hongliang Ren
TL;DR
Endo-2DTAM is a real-time endoscopic SLAM system that uses 2D Gaussian Splatting to deliver geometrically accurate reconstructions with high-quality novel-view rendering. By embedding surface normal information into tracking and mapping, and by employing pose-consistent keyframe sampling and bundle adjustment (BA), it addresses the multi-view depth inconsistencies inherent in 3DGS-based approaches. The method achieves state-of-the-art depth reconstruction on public endoscopic data ($1.87\pm0.63$ mm RMSE) while maintaining real-time performance, and it produces visually faithful renderings and accurate surface normals. This work advances intraoperative visualization and has the potential to improve surgical navigation and planning in minimally invasive procedures.
Abstract
Simultaneous Localization and Mapping (SLAM) is essential for precise surgical interventions and robotic tasks in minimally invasive procedures. While recent advances in 3D Gaussian Splatting (3DGS) have improved SLAM with high-quality novel view synthesis and fast rendering, these systems struggle with accurate depth and surface reconstruction due to multi-view inconsistencies. Simply combining SLAM with 3DGS leads to mismatches between the reconstructed frames. In this work, we present Endo-2DTAM, a real-time endoscopic SLAM system that uses 2D Gaussian Splatting (2DGS) to address these challenges. Endo-2DTAM incorporates a surface normal-aware pipeline consisting of tracking, mapping, and bundle adjustment modules for geometrically accurate reconstruction. Our robust tracking module combines point-to-point and point-to-plane distance metrics, while the mapping module uses normal consistency and depth distortion to enhance surface reconstruction quality. We also introduce a pose-consistent strategy for efficient and geometrically coherent keyframe sampling. Extensive experiments on public endoscopic datasets demonstrate that Endo-2DTAM achieves an RMSE of $1.87\pm 0.63$ mm for depth reconstruction of surgical scenes while maintaining computationally efficient tracking, high-quality visual appearance, and real-time rendering. Our code will be released at github.com/lastbasket/Endo-2DTAM.
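
To make the two distance metrics named for the tracking module concrete, the following is a minimal illustrative sketch in plain Python. It is not the released Endo-2DTAM implementation: the abstract does not give the exact loss, the weighting, or how correspondences are found, so the function names, the blending weight `w_plane`, and the assumption of pre-matched point pairs with unit normals are all hypothetical.

```python
import math

def _sub(a, b):
    # Component-wise difference of two 3D points.
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    # Dot product of two 3D vectors.
    return sum(x * y for x, y in zip(a, b))

def point_to_point(p, q):
    # Euclidean distance between a source point p and its matched target point q.
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))

def point_to_plane(p, q, n):
    # Distance from p to the tangent plane through q with unit normal n.
    return abs(_dot(_sub(p, q), n))

def tracking_residual(src, tgt, normals, w_plane=0.5):
    # Mean blended residual over matched points.
    # w_plane is a hypothetical weight, not a value taken from the paper.
    total = 0.0
    for p, q, n in zip(src, tgt, normals):
        total += (1.0 - w_plane) * point_to_point(p, q)
        total += w_plane * point_to_plane(p, q, n)
    return total / len(src)

# Two matched pairs on a surface whose normal points along +z.
src = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
tgt = [(0.0, 0.0, 1.2), (1.0, 0.3, 1.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
residual = tracking_residual(src, tgt, normals)
```

In this toy input the second pair differs only by a shift within the surface plane, so its point-to-plane term is zero while its point-to-point term is not. This illustrates why the two metrics are complementary: the point-to-plane term tolerates sliding along locally flat tissue, and the point-to-point term still penalizes it.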
