EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting

Chenxin Li, Brandon Y. Feng, Yifan Liu, Hengyu Liu, Cheng Wang, Weihao Yu, Yixuan Yuan

TL;DR

EndoSparse tackles the problem of sparse-view 3D reconstruction of endoscopic scenes by integrating 3D Gaussian Splatting with deformable scene modeling and distilling priors from vision foundation models. It introduces a diffusion-based appearance prior via Score Distillation Sampling and a geometric prior from Depth-Anything, enforced through differentiable depth rasterization, to regularize both appearance and geometry when only a few views are available. The approach yields real-time rendering with improved geometric accuracy and photorealism, outperforming state-of-the-art NeRF-based and 3D-GS methods on EndoNeRF-D and SCARED under sparse-view conditions. This work supports practical clinical deployment by enabling high-quality reconstructions from as few as 3 views.
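To make the geometric prior concrete: monocular depth predictors such as Depth-Anything generally give relative depth, defined only up to an unknown scale and shift, so a rendered depth map cannot be compared to the prior with a plain L1 or L2 error. The sketch below uses a Pearson-correlation loss, one common scale- and shift-invariant choice. This is an assumption for illustration and not necessarily the loss EndoSparse uses. The function name `pearson_depth_loss` and its arguments are hypothetical, NumPy stands in for a differentiable framework, and both maps are assumed to follow the same convention (both depth, or both inverse depth).

```python
import numpy as np

def pearson_depth_loss(rendered_depth, prior_depth, mask=None, eps=1e-8):
    """Return 1 - Pearson correlation between two depth maps.

    rendered_depth: depth rasterized from the 3D scene (hypothetical input).
    prior_depth:    monocular depth prediction for the same view (hypothetical input).
    mask:           optional boolean array selecting valid pixels.
    """
    r = np.asarray(rendered_depth, dtype=np.float64).ravel()
    p = np.asarray(prior_depth, dtype=np.float64).ravel()
    if mask is not None:
        keep = np.asarray(mask, dtype=bool).ravel()
        r, p = r[keep], p[keep]
    # Centering removes any constant shift between the two maps.
    r = r - r.mean()
    p = p - p.mean()
    # Normalizing by both norms removes any positive scale factor.
    corr = (r * p).sum() / (np.sqrt((r * r).sum() * (p * p).sum()) + eps)
    return 1.0 - corr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = rng.uniform(0.1, 1.0, size=(8, 8))
    # The prior here is an affine rescaling of the rendered depth.
    print(pearson_depth_loss(depth, 3.0 * depth + 2.0))
```

A prior that is a positive affine rescaling of the rendered depth gives a loss close to zero, while a prior with reversed depth ordering gives a loss close to two. In a real training loop the same expression would be written in an autodiff framework so that gradients reach the scene parameters through the depth rasterizer.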

Abstract

3D reconstruction of biological tissues from a collection of endoscopic images is key to unlocking various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually the case in real-world clinical scenarios. To tackle this sparsity challenge, we propose EndoSparse, a framework that leverages prior knowledge from multiple foundation models during the reconstruction process. Experimental results indicate that our proposed strategy significantly improves geometric and appearance quality under challenging sparse-view conditions, including using only three views. In rigorous benchmarking experiments against state-of-the-art methods, EndoSparse achieves superior results in terms of accurate geometry, realistic appearance, and rendering efficiency, confirming its robustness to sparse-view limitations in endoscopic reconstruction. EndoSparse signifies a steady step towards the practical deployment of neural 3D reconstruction in real-world clinical scenarios. Project page: https://endo-sparse.github.io/.

Paper Structure (12 sections, 6 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: EndoSparse Overview. Within a 3D-GS scene reconstruction framework, we incorporate vision foundation models as effective regularizers of the 3D scene: geometric priors from Depth-Anything [depthanything] and image appearance priors from Stable Diffusion [rombach2022high]. These provide valuable guidance signals for optimization at viewpoints without camera coverage.
  • Figure 2: Qualitative results of rendered images and depth maps on EndoNeRF-D.
  • Figure 3: Ablation analysis on EndoNeRF-D and SCARED datasets, with the results in terms of geometrical quality (top) and visual quality (bottom).
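
For readers unfamiliar with the appearance prior named in the TL;DR, the standard Score Distillation Sampling gradient (as introduced in DreamFusion) is written out below. This is the generic form, given as a reference sketch; the paper's own equations are not reproduced on this page, and its exact weighting and conditioning may differ. Here $x$ is an image rendered from scene parameters $\theta$, $\hat{\epsilon}_\phi$ is the frozen diffusion model's noise prediction, $y$ is the conditioning input, and $w(t)$ is a timestep-dependent weight.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad
x_t = \sqrt{\bar{\alpha}_t}\, x + \sqrt{1 - \bar{\alpha}_t}\,\epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)
```

The rendered image is noised to timestep $t$, the diffusion model predicts that noise, and the residual between predicted and injected noise is pushed back through the renderer. This is why the prior can supply a training signal at viewpoints that have no ground-truth image.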