Free-DyGS: Camera-Pose-Free Scene Reconstruction for Dynamic Surgical Videos with Gaussian Splatting
Qian Li, Shuojue Yang, Daiyun Shen, Jimmy Bok Yan So, Jing Qin, Yueming Jin
TL;DR
Free-DyGS tackles dynamic surgical scene reconstruction from endoscopic videos in which the camera poses are unknown and the tissue deforms. It introduces a Gaussian Splatting (GS) based pipeline that initializes the scene with a pre-trained Sparse Gaussian Regressor (SGR), then progressively expands the scene while jointly estimating deformations and 6D camera poses frame by frame, aided by a Retrospective Deformation Recapitulation (RDR) strategy. Core innovations include generalizable Gaussian parameterization via SGR for fast initialization and scene expansion, a Partially Activated Flexible Deformation Model (PFDM) to reduce temporal coupling, and retrospective learning to preserve historical deformations across the entire clip. Experiments on StereoMIS and Hamlyn show higher rendering fidelity and shorter training times than state-of-the-art methods, suggesting potential for intraoperative navigation and surgical education.
Abstract
High-fidelity reconstruction of surgical scenes is a crucial task that supports many applications, such as intra-operative navigation and surgical education. However, most existing methods assume idealized surgical scenarios: they either perform dynamic reconstruction of deforming tissue while assuming a given, fixed camera pose, or allow endoscope movement while reconstructing only static scenes. In this paper, we target a more realistic yet challenging setup: pose-free reconstruction with a moving camera in highly dynamic surgical scenes. We take the first step toward applying the Gaussian Splatting (GS) technique to this setting and propose a novel GS-based framework for fast reconstruction, termed Free-DyGS. Concretely, our model uses a novel scene initialization in which a pre-trained Sparse Gaussian Regressor (SGR) efficiently parameterizes the initial Gaussian attributes. For each subsequent frame, we jointly optimize the deformation model and the 6D camera pose in a frame-by-frame manner, which eases training because deformation differences between consecutive frames are limited. A Scene Expansion scheme then extends the GS model to cover unseen regions introduced by the moving camera. Moreover, the framework is equipped with a novel Retrospective Deformation Recapitulation (RDR) strategy that preserves the deformations of the entire clip throughout the frame-by-frame training scheme. The efficacy of Free-DyGS is substantiated through extensive experiments on two datasets, StereoMIS and Hamlyn. The results show that Free-DyGS surpasses other advanced methods in both rendering accuracy and efficiency. Code will be available.
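
The order of the steps described in the abstract can be illustrated with a minimal control-flow sketch. It is not the authors' implementation: every function and class name below is hypothetical, each step body is a placeholder that only records that it ran, and the way RDR picks earlier frames (a small random sample after each frame) is an assumption made for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class SceneState:
    """Toy container standing in for the Gaussian model and its side state."""
    num_gaussians: int = 0
    poses: List[Any] = field(default_factory=list)    # one pose estimate per frame
    history: List[int] = field(default_factory=list)  # frame indices already processed
    log: List[str] = field(default_factory=list)      # records the order of steps


def init_from_regressor(state, frame):
    # Placeholder for SGR-based initialization from the first frame.
    state.num_gaussians = 100
    state.poses.append("identity")
    state.history.append(0)
    state.log.append("init:0")


def optimize_deformation_and_pose(state, t, frame):
    # Placeholder for the joint deformation + 6D camera pose update on frame t.
    state.poses.append(f"pose_{t}")
    state.log.append(f"joint:{t}")


def expand_scene(state, t, frame):
    # Placeholder for adding Gaussians in regions newly seen by the camera.
    state.num_gaussians += 10
    state.log.append(f"expand:{t}")


def recapitulate(state, t, rng, k=2):
    # Placeholder for RDR: revisit up to k earlier frames (assumed: random sample).
    past = rng.sample(state.history, min(k, len(state.history)))
    state.log.append(f"rdr:{t}:{len(past)}")


def reconstruct(frames, seed=0):
    """Run the hypothetical frame-by-frame loop and return the final state."""
    rng = random.Random(seed)
    state = SceneState()
    init_from_regressor(state, frames[0])
    for t in range(1, len(frames)):
        optimize_deformation_and_pose(state, t, frames[t])
        expand_scene(state, t, frames[t])
        recapitulate(state, t, rng)
        state.history.append(t)
    return state


state = reconstruct(["f0", "f1", "f2"])
print(state.log)
```

In a real system each placeholder would run gradient-based optimization against rendered images; the sketch only fixes the sequencing of initialization, per-frame joint optimization, scene expansion, and retrospective replay.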
