Realistic Surgical Simulation from Monocular Videos

Kailing Wang, Chen Yang, Keyang Zhao, Xiaokang Yang, Wei Shen

TL;DR

SurgiSim tackles automatic surgical simulation from monocular videos by constructing a geometrically consistent canonical scene using 3D Gaussian Splatting and a deformation field, then running physics-based simulations with a visco-elastic tissue model. It introduces a multi-stage optimization regime with trajectory and anisotropic regularization to maintain geometry over time, and a video-guided, Maxwell-inspired visco-elastic parameter estimation within a differentiable MPM framework. The approach yields realistic tissue deformations under surgical interactions, validated by quantitative metrics and a user study showing strong preferences over baselines. The work automates simulation-ready environment creation and parameter inference, offering potential for enhanced surgical training, planning, and robotics. Overall, SurgiSim demonstrates how monocular videos can be transformed into high-fidelity, physics-driven surgical simulations through integrated geometric, physical, and data-guided components.

Abstract

This paper tackles the challenge of automatically performing realistic surgical simulations from readily available surgical videos. Recent efforts have successfully integrated physically grounded dynamics within 3D Gaussians to perform high-fidelity simulations in well-reconstructed simulation environments from static scenes. However, when applied to dynamic and complex surgical processes, they suffer from geometric inconsistency in the reconstructed simulation environments and from unrealistic physical deformations of simulated soft tissues. In this paper, we propose SurgiSim, a novel automatic simulation system to overcome these limitations. To build a surgical simulation environment, we maintain a canonical 3D scene composed of 3D Gaussians coupled with a deformation field to represent a dynamic surgical scene. This process involves a multi-stage optimization with trajectory and anisotropic regularization, enhancing the geometric consistency of the canonical scene, which serves as the simulation environment. To achieve realistic physical simulations in this environment, we implement a Visco-Elastic deformation model based on the Maxwell model, effectively reproducing the complex deformations of tissues. Additionally, we infer the physical parameters of tissues by minimizing the discrepancies between the input video and simulation results guided by estimated tissue motion, ensuring realistic simulation outcomes. Experiments on various surgical scenarios and interactions demonstrate SurgiSim's ability to perform realistic simulation of soft tissues during surgical procedures, showing its enormous potential for enhancing surgical training, planning, and robotic surgery systems. The project page is at https://namaenashibot.github.io/SurgiSim/.
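To make the two physical ideas in the abstract concrete, here is a minimal sketch of a Maxwell element and of parameter recovery by minimizing a discrepancy. It is a toy under stated assumptions, not the authors' implementation: the model is reduced to one dimension (a spring of stiffness `E` in series with a dashpot of viscosity `eta`), the "observation" is a stress curve rather than a rendered video, and a brute-force grid search stands in for gradient-based optimization through a differentiable MPM simulator. The names `maxwell_stress` and `fit_maxwell` and all numeric values are hypothetical.

```python
import math


def maxwell_stress(strain, dt, E, eta):
    """Integrate the 1D Maxwell law  d(sigma)/dt = E * d(eps)/dt - (E / eta) * sigma.

    strain: list of strain samples spaced dt apart; stress is assumed zero at
    the first sample. Returns a list of stress samples of the same length.
    """
    if E <= 0 or eta <= 0:
        raise ValueError("E and eta must be positive")
    tau = eta / E                  # relaxation time of the element
    decay = math.exp(-dt / tau)    # exact stress decay over one step
    sigma, out = 0.0, [0.0]
    for prev, cur in zip(strain, strain[1:]):
        # elastic increment from the strain change, then viscous relaxation
        sigma = (sigma + E * (cur - prev)) * decay
        out.append(sigma)
    return out


def fit_maxwell(strain, observed, dt, E_grid, eta_grid):
    """Toy stand-in for parameter estimation: pick the (E, eta) pair whose
    simulated stress has the smallest squared discrepancy to `observed`."""
    best = None
    for E in E_grid:
        for eta in eta_grid:
            sim = maxwell_stress(strain, dt, E, eta)
            loss = sum((s - o) ** 2 for s, o in zip(sim, observed))
            if best is None or loss < best[0]:
                best = (loss, E, eta)
    return best[1], best[2]


# Hypothetical experiment: a step strain is applied and held, so stress relaxes.
strain = [0.0] + [0.1] * 50
observed = maxwell_stress(strain, dt=0.04, E=5.0, eta=2.0)
E_hat, eta_hat = fit_maxwell(
    strain, observed, dt=0.04,
    E_grid=[float(i) for i in range(1, 11)],
    eta_grid=[0.5 * i for i in range(1, 11)],
)
```

Under a held step strain the stress decays as $\sigma(t) = E\,\varepsilon_0\,e^{-tE/\eta}$, so the initial amplitude constrains `E` and the decay rate constrains `eta`. That is why both parameters are recoverable from a single relaxation curve in this toy setting.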

Paper Structure

This paper contains 39 sections, 19 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: An overview of SurgiSim. From an input surgical video, we first reconstruct a 3D scene as the simulation environment, then estimate its physical parameters based on a Visco-Elastic deformation model, and finally perform realistic simulation dynamics in it. The green parts are surgical tool masks and the blue arrows indicate the directions of applied forces in simulations.
  • Figure 2: Illustration of our physical parameter estimation. SurgiSim automatically infers physical parameters by minimizing the discrepancy between rendered simulation results and the input video through differentiable MPM and rasterization.
  • Figure 3: Visualization of simulations. We show the trajectory direction with a blue arrow and the motion of the tissues with a red line. The external force caused by the driving trajectory ends at 2.52s (63 frames), after which the tissue rebounds freely. SurgiSim consistently produces the most realistic simulation dynamics. Please refer to the supplementary material for more simulation results.
  • Figure 4: Y-T slices of simulation dynamics. The slices at $x = 400$ capture the motion of tissues being lifted and then released. The slices at $x = 520$ capture the oscillations after the rebound.
  • Figure 5: Ablation on trajectory and geometric regularization. (a) shows the input video as a reference, (b) the result of the full multi-stage optimization, and (c), (d), and (e) the results of optimization without trajectory regularization, without geometric regularization, and without both, respectively.
  • ...and 1 more figure
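The Y-T slices described for Figure 4 can be read as fixing one image column $x$ and stacking that column across frames, so that vertical motion over time appears as a curve in a single image. A minimal sketch, assuming the video is held as a NumPy array of shape `(T, H, W)` or `(T, H, W, C)`. The function name `yt_slice` and the synthetic test video are hypothetical and are not taken from the paper.

```python
import numpy as np


def yt_slice(video, x):
    """Return the Y-T slice at image column x.

    video: array of shape (T, H, W) or (T, H, W, C).
    Result has shape (H, T) or (H, T, C): rows index y, columns index time.
    """
    video = np.asarray(video)
    if video.ndim not in (3, 4):
        raise ValueError("expected shape (T, H, W) or (T, H, W, C)")
    column = video[:, :, x]            # keep every frame and row, one column
    return np.swapaxes(column, 0, 1)   # put y on the vertical axis


# Synthetic example: a bright pixel in column 3 that moves down one row per frame.
video = np.zeros((4, 5, 6))
for t in range(4):
    video[t, t, 3] = 1.0
slice_at_3 = yt_slice(video, x=3)
```

In `slice_at_3` the moving pixel traces a diagonal, which is the same effect that lets a slice at a fixed $x$ show tissue being lifted, released, and oscillating.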