
\textit{4DSurf}: High-Fidelity Dynamic Scene Surface Reconstruction

Renjie Wu, Hongdong Li, Jose M. Alvarez, Miaomiao Liu

Abstract

This paper addresses the problem of dynamic scene surface reconstruction using Gaussian Splatting (GS), aiming to recover temporally consistent geometry. While existing GS-based dynamic surface reconstruction methods can yield superior reconstructions, they are typically limited to a single object or to objects with only small deformations, and struggle to maintain temporally consistent surface reconstruction under large deformations over time. We propose ``\textit{4DSurf}'', a novel and unified framework for generic dynamic surface reconstruction that does not require specifying the number or types of objects in the scene, handles large surface deformations, and mitigates temporal inconsistency in reconstruction. The key innovation of our framework is a Gaussian-deformation-induced Signed Distance Function (SDF) Flow Regularization that constrains the motion of Gaussians to align with the evolving surface. To handle large deformations, we introduce an Overlapping Segment Partitioning strategy that divides the sequence into overlapping segments with small deformations and incrementally passes geometric information across segments through the shared overlapping timestep. Experiments on two challenging dynamic scene datasets, Hi4D and CMU Panoptic, demonstrate that our method outperforms state-of-the-art surface reconstruction methods by 49\% and 19\% in Chamfer distance, respectively, and achieves superior temporal consistency under sparse-view settings.

Paper Structure

This paper contains 13 sections, 6 equations, 7 figures, and 5 tables.

Figures (7)

  • Figure 1: Left: Our method takes only a sparse set of input videos. Right: Our approach sets a new state of the art in surface reconstruction for dynamic scenes on the CMU Panoptic dataset Joo_2017_TPAMI, compared with recent dynamic surface reconstruction methods (Neural SDF-Flow mao2024neural, Sparse2DGS wu2025sparse2dgs, GauSTAR zheng2025gaustar, D-2DGS zhang2024dynamic, and ST-2DGS wang2024space).
  • Figure 2: Overview: (a) Overall Training Pipeline. We first divide the sequence into $N$ segments, each containing $K{+}1$ timesteps with one overlapping virtual timestep. For the $1^{\text{st}}$ segment, the initialization is derived from the visual hull reconstructed from all frames of its first timestep. After training the first segment, the Gaussians of the virtual timestep serve as the initialization for training the next segment. Each segment maintains its own canonical space and Gaussian Velocity Field. (b) Gaussian Velocity Field. Given the Gaussian center $\bm{\mu}_i$ in the canonical space and a specific timestep $t$, the Gaussian Velocity Field $\mathcal{F}_{\bm{\theta}}(\cdot)$ predicts its velocity $\mathbf{v}(\bm{\mu}_i, t)$, angular velocity $\bm{\omega}(\bm{\mu}_i, t)$ and expansion velocity $\bm{e}(\bm{\mu}_i, t)$ at timestep $t$. These are then converted to position $\bm{\mu}_i^{t}$, rotation $\bm{q}_i^{t}$, and scale $\bm{\xi}_i^{t}$, which are fed into the differentiable rasterizer for image rendering. (c) SDF Approximation. Following previous works guedon2023sugar, newcombe2011kinectfusion, we compute the distance between the center and its corresponding depth point to estimate the signed distance.
  • Figure 3: Incremental Motion Tuning (IMT). After training the Gaussian Velocity Field of the $1^{\text{st}}$ segment, for each later $N^{\text{th}}$ segment ($N \ge 2$), its Gaussian Velocity Field $\mathcal{F}_{\bm{\theta}^{N}}$ is initialized from the weights $\bm{\theta}^{N-1}$ of the previous segment and fine-tuned using LoRA $\Delta\bm{\theta}^{N}$.
  • Figure 4: Qualitative results on CMU Panoptic Joo_2017_TPAMI. We compare our method against three baselines (Dynamic-2DGS zhang2024dynamic, Sparse2DGS wu2025sparse2dgs, FreeTimeGS wang2025freetimegs) at two timesteps of the Band1 and Ian3 scenes. Bounding boxes highlight major differences.
  • Figure 5: Qualitative results on Hi4D yin2023hi4d. We compare our method against three baselines (Dynamic-2DGS zhang2024dynamic, Sparse2DGS wu2025sparse2dgs, FreeTimeGS wang2025freetimegs) at two timesteps of the Basketball13 and Fight17 scenes. Bounding boxes highlight major differences.
  • ...and 2 more figures
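The SDF approximation described in Figure 2(c) — estimating the signed distance of a Gaussian center from the distance to its corresponding depth point — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the pinhole projection setup, nearest-pixel lookup, and the sign convention (positive in front of the rendered surface) are all assumptions.

```python
import numpy as np

def approximate_sdf(centers, depth, K, w2c):
    """Approximate signed distances for Gaussian centers (hypothetical sketch).

    centers: (N, 3) Gaussian centers in world coordinates
    depth:   (H, W) rendered depth map
    K:       (3, 3) camera intrinsics
    w2c:     (4, 4) world-to-camera extrinsics
    Returns (N,) signed distances: rendered surface depth minus center depth,
    so centers in front of the surface get positive values (sign convention
    assumed here).
    """
    # Transform centers into camera coordinates.
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    cam = (w2c @ homo.T).T[:, :3]  # (N, 3); cam[:, 2] is depth along the optical axis

    # Project onto the image plane to find each center's corresponding pixel.
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, depth.shape[1] - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, depth.shape[0] - 1)

    # Signed distance ~ depth of the rendered surface point minus depth of the center.
    return depth[v, u] - cam[:, 2]
```

For example, with a camera at the origin looking down the $z$-axis at a flat surface rendered at depth 2.0, a Gaussian center at depth 1.5 receives a signed distance of 0.5, i.e. it lies half a unit in front of the surface.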