SimEndoGS: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians

Zhenya Yang, Kai Chen, Yonghao Long, Qi Dou

TL;DR

This work addresses data-driven, physically plausible surgical scene simulation by learning a 3D Gaussian representation from stereo endoscopic videos and embedding physics through the Material Point Method (MPM). It introduces depth supervision and anisotropy regularization to combat overfitting and geometric artifacts, together with a Gaussian padding step to stabilize simulation. The framework integrates a Neo-Hookean tissue model within the MPM solver. Given the deformation gradient $\mathbf{F}$ with $J=\det(\mathbf{F})$, Gaussian covariances are updated via $\boldsymbol{\Sigma}^{\prime}=\mathbf{F}\boldsymbol{\Sigma}\mathbf{F}^{T}$, and the first Piola-Kirchhoff (PK1) stress is $\mathbf{PK_1}=\mu(\mathbf{F}-\mathbf{F}^{-T})+\lambda\log(J)\mathbf{F}^{-T}$, where the Lamé parameters $\mu=E/(2(1+\nu))$ and $\lambda=E\nu/((1+\nu)(1-2\nu))$ are derived from Young's modulus $E$ and Poisson's ratio $\nu$ and control material stiffness. The method, termed SimEndoGS, reconstructs occlusion-free tissue and enables physically based deformation in near real time, achieving competitive reconstruction quality (PSNRs around $60$--$65$ dB) while delivering more realistic simulation than the EndoGS baseline. The approach promises scalable, automated generation of interactive surgical scenes for education and robot learning by converting surgical videos into physics-informed simulations efficiently.
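The formulas in the TL;DR can be made concrete with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the material values are arbitrary placeholders rather than values from the paper, and the stress uses the standard Neo-Hookean form $\mu(\mathbf{F}-\mathbf{F}^{-T})+\lambda\log(J)\mathbf{F}^{-T}$, which vanishes for the undeformed state $\mathbf{F}=\mathbf{I}$.

```python
import numpy as np


def lame_parameters(E, nu):
    """Convert Young's modulus E and Poisson's ratio nu to Lame parameters."""
    mu = E / (2.0 * (1.0 + nu))
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return mu, lam


def neo_hookean_pk1(F, mu, lam):
    """First Piola-Kirchhoff stress of a Neo-Hookean solid.

    Assumes det(F) > 0 so that log(J) is defined.
    """
    F_inv_T = np.linalg.inv(F).T
    J = np.linalg.det(F)
    return mu * (F - F_inv_T) + lam * np.log(J) * F_inv_T


def deform_covariance(Sigma, F):
    """Push a Gaussian covariance through the deformation: Sigma' = F Sigma F^T."""
    return F @ Sigma @ F.T


if __name__ == "__main__":
    # Placeholder material values chosen for illustration only.
    mu, lam = lame_parameters(E=1.0e4, nu=0.3)

    # A 10% stretch along x applied to an isotropic unit Gaussian.
    F = np.diag([1.1, 1.0, 1.0])
    print("PK1 stress:\n", neo_hookean_pk1(F, mu, lam))
    print("Deformed covariance:\n", deform_covariance(np.eye(3), F))
```

Because $\boldsymbol{\Sigma}^{\prime}=\mathbf{F}\boldsymbol{\Sigma}\mathbf{F}^{T}$ is a congruence transform, the updated covariance stays symmetric positive semi-definite, so each deformed Gaussian remains a valid Gaussian for rendering.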

Abstract

Surgical scene simulation plays a crucial role in surgical education and simulator-based robot learning. Traditional approaches to creating these surgical environments involve a labor-intensive process in which designers hand-craft tissue models with textures and geometries for soft-body simulation. This manual approach is not only time-consuming but also limited in scalability and realism. In contrast, data-driven simulation offers a compelling alternative: it has the potential to automatically reconstruct 3D surgical scenes from real-world surgical video data and then apply soft-body physics. This area, however, is relatively uncharted. In our research, we introduce 3D Gaussians as a learnable representation for surgical scenes, learned from stereo endoscopic video. To prevent overfitting and ensure the geometric correctness of these scenes, we incorporate depth supervision and anisotropy regularization into the Gaussian learning process. Furthermore, we apply the Material Point Method, integrated with physical properties, to the 3D Gaussians to achieve realistic scene deformations. Our method was evaluated on our in-house dataset and on public surgical video datasets. Results show that it can reconstruct and simulate surgical scenes from endoscopic videos efficiently, taking only a few minutes to reconstruct a surgical scene, and that it produces both visually and physically plausible deformations at a speed approaching real time. The results demonstrate the great potential of our proposed method to enhance the efficiency and variety of simulations available for surgical education and robot learning.

Paper Structure

This paper contains 10 sections, 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: An overview of the proposed data-driven surgical simulation framework. It consists of automatic scene reconstruction and physically-based scene simulation using 3D Gaussians.
  • Figure 2: Qualitative evaluation of simulation performance. The external forces are indicated using red arrows. The corresponding simulation videos are included in the supplementary material.
  • Figure 3: Comparison with EndoGS [zhu2024endogs]. Reconstruction and simulation results of our method compared with those of EndoGS.
  • Figure 4: Ablation study. We compare our base simulation result with results obtained without depth supervision, Gaussian padding, or anisotropy regularization. Artifacts are highlighted with white dashed boxes.