Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction

Decai Chen, Brianne Oberson, Ingo Feldmann, Oliver Schreer, Anna Hilsmann, Peter Eisert

TL;DR

This work proposes AT-GS, a novel method for reconstructing high-quality dynamic surfaces from multi-view videos through per-frame incremental optimization. It introduces a unified, adaptive gradient-aware densification strategy that integrates the strengths of conventional cloning and splitting techniques.

Abstract

3D Gaussian Splatting has recently achieved notable success in novel view synthesis for dynamic scenes and geometry reconstruction in static scenes. Building on these advancements, early methods have been developed for dynamic surface reconstruction by globally optimizing entire sequences. However, reconstructing dynamic scenes with significant topology changes, emerging or disappearing objects, and rapid movements remains a substantial challenge, particularly for long sequences. To address these issues, we propose AT-GS, a novel method for reconstructing high-quality dynamic surfaces from multi-view videos through per-frame incremental optimization. To avoid local minima across frames, we introduce a unified and adaptive gradient-aware densification strategy that integrates the strengths of conventional cloning and splitting techniques. Additionally, we reduce temporal jittering in dynamic surfaces by ensuring consistency in curvature maps across consecutive frames. Our method achieves superior accuracy and temporal coherence in dynamic surface reconstruction, delivering high-fidelity space-time novel view synthesis, even in complex and challenging scenes. Extensive experiments on diverse multi-view video datasets demonstrate the effectiveness of our approach, showing clear advantages over baseline methods. Project page: https://fraunhoferhhi.github.io/AT-GS
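The curvature-consistency idea in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation. It assumes that curvature is approximated by how much a unit normal changes between neighbouring pixels, and that the previous frame's normal map has already been warped into the current frame. The function names `curvature_map` and `curvature_consistency_loss`, the finite-difference proxy, and the mean-absolute-difference penalty are all illustrative assumptions.

```python
"""Toy curvature-consistency loss between two normal maps (pure Python)."""
from __future__ import annotations

import math


def curvature_map(normals: list[list[tuple]]) -> list[list[float]]:
    """Curvature proxy (an assumption): change of the normal towards the
    right and downward neighbours, with indices clamped at the border."""
    h, w = len(normals), len(normals[0])
    curv = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            n = normals[y][x]
            right = normals[y][min(x + 1, w - 1)]
            down = normals[min(y + 1, h - 1)][x]
            curv[y][x] = math.dist(n, right) + math.dist(n, down)
    return curv


def curvature_consistency_loss(prev_warped, current) -> float:
    """Mean absolute difference between the two curvature maps."""
    c_prev = curvature_map(prev_warped)
    c_cur = curvature_map(current)
    diffs = [
        abs(a - b)
        for row_p, row_c in zip(c_prev, c_cur)
        for a, b in zip(row_p, row_c)
    ]
    return sum(diffs) / len(diffs)


# A flat 2x2 patch (all normals facing +z) and the same patch with one bent normal.
flat = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
        [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
bent = [[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]

loss_same = curvature_consistency_loss(flat, flat)
loss_bent = curvature_consistency_loss(flat, bent)
```

In this sketch the loss is zero when the surface keeps the same curvature between frames and grows when the curvature changes from one frame to the next, which is the kind of change that shows up as temporal jitter.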

Paper Structure

This paper contains 20 sections, 4 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: Comparison of our proposed method on a scene from the DNA-Rendering dataset [2023dnarendering]. The training time and LPIPS scores (lower is better) are averaged across the sequence. Our approach not only achieves photorealistic novel view rendering with significantly reduced training time compared to the recent method [xu20244k4d], but also produces finer surface meshes, surpassing the state-of-the-art results [wang2023neus2].
  • Figure 2: Pipeline of our method. Starting with the Gaussian surfels from the previous frame ($t-1$), we first estimate their coarse translation and rotation to align with the current frame ($t$). Subsequently, we optimize all Gaussian attributes, incorporating our gradient-guided densification strategy. For each training view, we render opacity, depth, normal, and color maps (from top to bottom in the dashed box) using differentiable tile-based rasterization. Additionally, we predict optical flow between consecutive frames, which is used to warp the rendered normal map from frame $t-1$ to frame $t$. We then ensure temporal consistency of the underlying surface by comparing curvature maps derived from the warped and rendered normal maps. Furthermore, we apply photometric loss, depth-normal consistency loss, and mask loss for supervision. Finally, Poisson reconstruction is employed to generate a mesh from the unprojected depth and normal maps.
  • Figure 3: Gradient-aware splitting. (a) 1D illustration of sampling PDFs (normal distributions), which determine the positions of new split Gaussians. (b) Conventional splitting (red dashed ellipse) samples a new, smaller Gaussian from a multivariate normal distribution centered at the original Gaussian, with standard deviations equal to its scales. In contrast, our approach (green solid ellipse) adaptively guides the sampling using view-space positional gradients, while preserving the size of the original Gaussian.
  • Figure 4: Qualitative comparison of novel view synthesis on the DNA-Rendering dataset [2023dnarendering].
  • Figure 5: Qualitative comparison of novel view synthesis on the NHR dataset [nhr].
  • ...and 4 more figures
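
The contrast drawn in the Figure 3 caption can be sketched in one dimension, in the spirit of its panel (a). This is a hedged illustration only. The conventional branch follows the caption (sample the new centre from a normal distribution centred at the original Gaussian, with standard deviation equal to its scale, and make the new Gaussian smaller). The gradient-guided branch is a hypothetical stand-in: shifting the sampling PDF by `step * scale` along the sign of the gradient is an assumption made here for illustration, not the paper's formula. The `shrink=1.6` factor is likewise an assumed default.

```python
"""1D toy comparison of two ways to place a Gaussian created by a split."""
import random


def conventional_split(mean, scale, rng, shrink=1.6):
    """Sample the new centre from N(mean, scale^2) and shrink the new Gaussian.

    The shrink factor is an assumed default, not taken from the caption.
    """
    return rng.gauss(mean, scale), scale / shrink


def gradient_guided_split(mean, scale, grad, rng, step=1.0):
    """Hypothetical variant: shift the sampling PDF along the gradient sign
    and keep the original scale unchanged."""
    direction = (grad > 0) - (grad < 0)  # sign of the 1D gradient: -1, 0 or +1
    shifted_mean = mean + direction * step * scale
    return rng.gauss(shifted_mean, scale), scale


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n = 20000
conv = [conventional_split(0.0, 0.5, rng) for _ in range(n)]
guided = [gradient_guided_split(0.0, 0.5, grad=2.0, rng=rng) for _ in range(n)]

conv_centre = sum(pos for pos, _ in conv) / n        # stays near the original mean
guided_centre = sum(pos for pos, _ in guided) / n    # pulled towards mean + scale
```

Under these assumptions the conventional samples stay centred on the original Gaussian and come out smaller, while the guided samples cluster on the side indicated by the gradient and keep the original size. Passing a negative `step` would move the samples against the gradient instead.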