EndoLRMGS: Complete Endoscopic Scene Reconstruction combining Large Reconstruction Modelling and Gaussian Splatting

Xu Wang, Shuai Zhang, Baoru Huang, Danail Stoyanov, Evangelos B. Mazomenos

TL;DR

This work tackles complete endoscopic scene reconstruction for robot-assisted surgery by addressing the depth discontinuities and occlusions that hinder existing methods. It introduces EndoLRMGS, which combines Large Reconstruction Modelling (LRM) for rigid surgical tools with Gaussian Splatting (GS) for deformable tissue, and adds Orthogonal Perspective Joint Projection Optimization (OPjPO) to jointly optimize and align scale and position between the two representations. The approach uses DEVA for per-tool segmentation, EndoGaussian for tissue surfaces, and LRM for detailed tool geometry, with color and depth losses to enforce consistency. Experiments on four surgical videos across three public datasets demonstrate state-of-the-art performance in both tool reconstruction (IoU ≈ 81.23%) and tissue reconstruction (improved PSNR, SSIM, and LPIPS), showing that the method produces complete, high-fidelity 3D reconstructions that include occluded regions. Overall, EndoLRMGS provides a practical pathway to accurate, watertight scene models in endoscopic surgery, enabling improved navigation, visualization, and automation in robot-assisted procedures.

Abstract

Complete reconstruction of surgical scenes is crucial for robot-assisted surgery (RAS). Deep depth estimation is promising, but existing works struggle with depth discontinuities, which results in noisy predictions at object boundaries, and they do not achieve complete reconstruction because occluded surfaces are omitted. To address these issues, we propose EndoLRMGS, which combines Large Reconstruction Modelling (LRM) and Gaussian Splatting (GS) for complete surgical scene reconstruction. GS reconstructs deformable tissues and LRM generates 3D models of surgical tools, while their position and scale are subsequently optimized with the proposed orthogonal perspective joint projection optimization (OPjPO) to enhance accuracy. In experiments on four surgical videos from three public datasets, our method improves the Intersection-over-Union (IoU) of tool 3D models in 2D projections by more than 40%. Additionally, EndoLRMGS improves the PSNR of the tool projections by between 3.82% and 11.07%. Tissue rendering quality also improves, with PSNR gains ranging from 0.46% to 49.87% and SSIM gains from 1.53% to 29.21% across all test videos.
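
The IoU figure quoted in the abstract compares the 2D projection of a tool's 3D model against a segmentation mask. As a minimal sketch of that metric (not the authors' evaluation code), assume both masks are binary and stored as nested lists of 0/1; the helper name `mask_iou` and the empty-mask convention are illustrative assumptions.

```python
def mask_iou(pred, gt):
    """Intersection-over-Union of two equally sized binary masks.

    `pred` and `gt` are nested lists (rows of 0/1 values). The name and
    the input format are assumptions made for this illustration.
    """
    intersection = 0
    union = 0
    for row_pred, row_gt in zip(pred, gt):
        for p, g in zip(row_pred, row_gt):
            p, g = bool(p), bool(g)
            intersection += p and g   # pixel set in both masks
            union += p or g           # pixel set in at least one mask
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Toy 2x3 masks: projected tool silhouette vs. reference mask.
projected = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
iou = mask_iou(projected, reference)  # 2 shared pixels / 4 covered pixels
```

On real frames the same computation is normally vectorized over image arrays; the nested-list form is used here only to keep the sketch dependency-free.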

Paper Structure

This paper contains 12 sections, 8 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Reconstruction results from depth estimation (Shi et al., 2023), single-view reconstruction (Hong et al., 2024), and the proposed EndoLRMGS. EndoLRMGS reconstructs a complete and clear 3D surgical scene, including occluded tissue and closed surgical tools. Side views reveal that noisy points, which should not appear in a faithful reconstruction, are eliminated.
  • Figure 2: The EndoLRMGS framework includes surgical tool segmentation with unique mask assignment, separate reconstruction of surgical tools and tissue, and the solution of scale and position uncertainties through the proposed OPjPO method.
  • Figure 3: The OPjPO framework for solving scale factor and position uncertainty. The scale factor of a tool 3D model is first determined by orthogonal projection, and then its spatial position is determined by perspective projection.
  • Figure 4: 3D reconstructions by Shi et al. and EndoLRMGS in different situations: deformable tissue and moving surgical tools (StereoMIS and EndoNeRF) and a moving camera (SCARED).
  • Figure 5: Comparison of 3D surgical tool reconstruction and 2D projections across three datasets. Columns: (1) RGB images, (2) BundleSDF 3D reconstruction, (3) BundleSDF projection with ground truth masks, (4) LRM 3D reconstruction, (5) LRM projection with ground truth masks, (6) Optimized projection using our method.
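
The Figure 3 caption says scale is resolved under orthogonal projection before position is resolved under perspective projection. The toy below illustrates one reason such a split can be useful; it is not the OPjPO algorithm itself. Under a pinhole (perspective) camera, making an object larger and moving it proportionally farther away leaves its image unchanged. An orthographic projection ignores depth, so its projected size reflects scale alone. All function names, intrinsics (`f`, `cx`, `cy`), and the `pixels_per_unit` factor are hypothetical values chosen for illustration.

```python
import math


def perspective(point, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection: pixel position depends on x/z and y/z."""
    x, y, z = point
    return (f * x / z + cx, f * y / z + cy)


def orthographic(point, pixels_per_unit=1000.0, cx=320.0, cy=240.0):
    """Orthographic projection: depth z is ignored entirely."""
    x, y, _ = point
    return (pixels_per_unit * x + cx, pixels_per_unit * y + cy)


def scale_about_camera(points, s):
    """Scale size and distance together (camera at the origin)."""
    return [(s * x, s * y, s * z) for x, y, z in points]


def extent(pixels):
    """Width of a projected point set along the image x-axis."""
    us = [u for u, _ in pixels]
    return max(us) - min(us)


# Two points on a hypothetical tool, in camera coordinates (metres).
tool = [(0.01, 0.00, 0.10), (0.03, 0.01, 0.10)]
bigger_and_farther = scale_about_camera(tool, 2.0)

persp_a = [perspective(p) for p in tool]
persp_b = [perspective(p) for p in bigger_and_farther]
ortho_a = [orthographic(p) for p in tool]
ortho_b = [orthographic(p) for p in bigger_and_farther]

# Perspective images coincide, so scale and depth cannot be separated there.
same_perspective = all(
    math.isclose(a, b) for pa, pb in zip(persp_a, persp_b) for a, b in zip(pa, pb)
)
# Orthographic extent doubles with the object, exposing the scale factor.
ortho_ratio = extent(ortho_b) / extent(ortho_a)
```

In this toy, `same_perspective` is true while `ortho_ratio` recovers the factor of 2, which is consistent with fixing scale first and then using the perspective view for position.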