EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting

Lingting Zhu, Zhao Wang, Jiahao Cui, Zhenchao Jin, Guying Lin, Lequan Yu

TL;DR

EndoGS is a framework for reconstructing deformable endoscopic tissues with Gaussian Splatting. It extends the static 3D Gaussian representation to 4D by adding time-dependent deformations: a set of static Gaussians is combined with a deformation field built on six HexPlane feature planes. Training is supervised with depth maps and tool masks under spatiotemporal weighting, and surface-aligned regularization is added to improve geometry. The approach achieves high rendering fidelity at real-time speeds on DaVinci robotic surgery videos, outperforming prior methods such as EndoNeRF and ForPlane. It enables more accurate and efficient 3D tissue reconstruction from monocular videos, with potential benefits for surgical AR, education, and robot learning.

Abstract

Surgical 3D reconstruction is a critical area of research in robotic surgery, and recent works have adopted variants of dynamic radiance fields to reconstruct deformable tissues in 3D from single-viewpoint videos. However, these methods often suffer from time-consuming optimization or inferior quality, limiting their adoption in downstream tasks. Inspired by 3D Gaussian Splatting, a recently popular 3D representation, we present EndoGS, which applies Gaussian Splatting to deformable endoscopic tissue reconstruction. Specifically, our approach incorporates deformation fields to handle dynamic scenes, depth-guided supervision with spatial-temporal weight masks to optimize 3D targets under tool occlusion from a single viewpoint, and surface-aligned regularization terms to capture much better geometry. As a result, EndoGS reconstructs and renders high-quality deformable endoscopic tissues from a single-viewpoint video, estimated depth maps, and labeled tool masks. Experiments on DaVinci robotic surgery videos demonstrate that EndoGS achieves superior rendering quality. Code is available at https://github.com/HKU-MedAI/EndoGS.
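
To make the idea of supervising under tool occlusion concrete, the following is a minimal NumPy sketch of a masked, per-pixel-weighted L1 loss that could be applied to either rendered color or rendered depth. It is not taken from the EndoGS code: the function names `masked_l1` and `occlusion_frequency_weight` are hypothetical, and the weighting rule (up-weighting pixels that tools cover more often across the video) is only an assumed stand-in for the spatial-temporal weight masks named in the abstract.

```python
import numpy as np


def masked_l1(pred, target, tool_mask, weight=None):
    """Weighted mean absolute error over pixels that are not covered by a tool.

    pred, target: arrays of shape [H, W] (depth) or [H, W, C] (color).
    tool_mask:    bool array [H, W], True where a surgical tool is visible.
    weight:       optional [H, W] per-pixel weights (assumed, see below).
    """
    valid = (~tool_mask).astype(np.float64)  # 1 on tissue, 0 on tools
    if weight is not None:
        valid = valid * weight
    err = np.abs(pred - target)
    if err.ndim == 3:  # average over color channels
        err = err.mean(axis=-1)
    return float((valid * err).sum() / max(valid.sum(), 1e-8))


def occlusion_frequency_weight(tool_masks):
    """Assumed weighting: 1 + fraction of frames in which a pixel is occluded.

    tool_masks: bool array [T, H, W] holding the tool mask of every frame.
    Pixels that are rarely visible get a larger weight when they are visible.
    """
    return 1.0 + tool_masks.mean(axis=0)


# Tiny example: two 2x2 frames, supervising frame 0.
tool_masks = np.array(
    [[[False, True], [False, False]],
     [[True, False], [False, False]]], dtype=bool)
pred = np.zeros((2, 2))
target = np.array([[1.0, 2.0], [3.0, 4.0]])

loss_plain = masked_l1(pred, target, tool_masks[0])
loss_weighted = masked_l1(pred, target, tool_masks[0],
                          occlusion_frequency_weight(tool_masks))
```

In this toy case the pixel hidden by the tool in frame 0 contributes nothing to either loss, while the pixel that is hidden in the other frame counts more heavily in the weighted variant.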

Paper Structure (12 sections, 9 equations, 4 figures, 1 table)

This paper contains 12 sections, 9 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Overview of the EndoGS pipeline. For each 3D Gaussian, the mean and the timestamp are used to query multi-resolution voxel planes and compute features. A single MLP then predicts the deformation of the Gaussians. Differentiable rasterization produces rendered images and depth maps, which are supervised with ground-truth images, depth maps, and tool masks.
  • Figure 2: Qualitative results on scene "traction" at different timesteps.
  • Figure 3: Ablation on spatial TV loss. We show rendering frames w/ and w/o spatial TV loss on scene "cutting tissues twice".
  • Figure 4: Ablation on Surface-Aligned regularization. We show reconstruction results w/o and w/ Surface-Aligned regularization on scene "pulling soft tissues".
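
The deformation step described in the Figure 1 caption can be sketched roughly as follows. This is an illustrative NumPy sketch under several assumptions, not the authors' implementation: coordinates and time are assumed to be normalized to [0, 1], features from the six planes are combined by element-wise product and concatenated across resolutions (one common HexPlane-style choice), and the MLP here only predicts an offset to the Gaussian means. All names (`PLANE_AXES`, `bilinear`, `query_planes`, `deform_means`) are hypothetical.

```python
import numpy as np

# Axis pairs over (x, y, z, t): three spatial planes and three space-time planes.
PLANE_AXES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]


def bilinear(plane, u, v):
    """Bilinearly sample plane[H, W, C] at normalized coordinates u, v in [0, 1]."""
    h, w, _ = plane.shape
    x, y = u * (h - 1), v * (w - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, h - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, w - 2)
    fx, fy = (x - x0)[:, None], (y - y0)[:, None]
    return ((1 - fx) * (1 - fy) * plane[x0, y0]
            + fx * (1 - fy) * plane[x0 + 1, y0]
            + (1 - fx) * fy * plane[x0, y0 + 1]
            + fx * fy * plane[x0 + 1, y0 + 1])


def query_planes(planes, means, t):
    """Combine features from six planes for Gaussian means [N, 3] at time t."""
    coords = np.concatenate([means, np.full((len(means), 1), t)], axis=1)
    feat = np.ones((len(means), planes[0].shape[-1]))
    for plane, (a, b) in zip(planes, PLANE_AXES):
        feat = feat * bilinear(plane, coords[:, a], coords[:, b])  # assumed product
    return feat


def deform_means(means, t, grids, w1, b1, w2, b2):
    """Query every resolution level, then apply a small two-layer MLP."""
    feats = np.concatenate([query_planes(g, means, t) for g in grids], axis=1)
    hidden = np.maximum(feats @ w1 + b1, 0.0)  # ReLU
    return means + hidden @ w2 + b2  # predicted offset added to the static means


# Toy setup: two resolution levels, 4 feature channels per level.
rng = np.random.default_rng(0)
channels = 4
grids = [[rng.normal(size=(r, r, channels)) for _ in PLANE_AXES] for r in (8, 16)]
w1, b1 = rng.normal(size=(2 * channels, 16)) * 0.1, np.zeros(16)
w2, b2 = np.zeros((16, 3)), np.zeros(3)  # zero output layer: no deformation yet
means = rng.uniform(size=(5, 3))
deformed = deform_means(means, 0.5, grids, w1, b1, w2, b2)
```

Initializing the output layer to zero, as in the toy setup, makes the deformed Gaussians coincide with the static ones at the start, which is a common choice for deformation networks. In a full pipeline the MLP would typically also predict changes to other Gaussian attributes, which this sketch leaves out.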