3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis

Zhicheng Lu, Xiang Guo, Le Hui, Tianrui Chen, Min Yang, Xiao Tang, Feng Zhu, Yuchao Dai

TL;DR

This work tackles dynamic view synthesis from monocular video by introducing 3D geometry-aware deformable Gaussian Splatting. It combines a Gaussian canonical field to capture static geometry with a deformation field that predicts per-Gaussian motion, rotation, and scale across time, aided by a sparse 3D convolution-based geometry feature extractor and a continuous 6D rotation representation. The method uses differentiable 3D Gaussian rasterization and a tailored density-control strategy, achieving state-of-the-art results on synthetic and real dynamic datasets with strong qualitative and quantitative gains. This approach improves 3D reconstruction and dynamic view synthesis while maintaining efficiency, though it relies on accurate camera poses and may struggle with very long or highly complex motion sequences.
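A continuous 6D rotation representation, as mentioned above, is commonly turned into a rotation matrix by Gram-Schmidt orthogonalization of two 3D vectors. The following is a minimal NumPy sketch of that standard conversion, not the authors' implementation; the function name `rotation_6d_to_matrix` and the choice to stack the basis vectors as columns are assumptions made for illustration.

```python
import numpy as np


def rotation_6d_to_matrix(d6: np.ndarray) -> np.ndarray:
    """Map (..., 6) vectors to (..., 3, 3) rotation matrices.

    Hypothetical helper: the first three numbers and the last three numbers
    are treated as two 3D vectors that are orthonormalized by Gram-Schmidt.
    """
    a1, a2 = d6[..., :3], d6[..., 3:]
    # First basis vector: normalize a1.
    b1 = a1 / np.linalg.norm(a1, axis=-1, keepdims=True)
    # Second basis vector: remove the component of a2 along b1, then normalize.
    a2_orth = a2 - np.sum(b1 * a2, axis=-1, keepdims=True) * b1
    b2 = a2_orth / np.linalg.norm(a2_orth, axis=-1, keepdims=True)
    # Third basis vector: the cross product completes a right-handed frame.
    b3 = np.cross(b1, b2)
    # Assumption: basis vectors are stored as matrix columns.
    return np.stack([b1, b2, b3], axis=-1)


# Usage: two per-Gaussian 6D outputs, e.g. as predicted by a deformation network.
d6 = np.array([[1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [2.0, 1.0, 0.0, 0.5, 3.0, 1.0]])
R = rotation_6d_to_matrix(d6)
```

Because any pair of non-parallel 3D vectors maps to a valid rotation, a network can regress the six numbers without constraints, which is the usual motivation for preferring this parameterization over quaternions or Euler angles.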

Abstract

In this paper, we propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis. Existing neural radiance field (NeRF) based solutions learn the deformation in an implicit manner, which cannot incorporate 3D scene geometry. Therefore, the learned deformation is not necessarily geometrically coherent, which results in unsatisfactory dynamic view synthesis and 3D dynamic reconstruction. Recently, 3D Gaussian Splatting has provided a new representation of the 3D scene, building upon which the 3D geometry can be exploited in learning the complex 3D deformation. Specifically, the scenes are represented as a collection of 3D Gaussians, where each 3D Gaussian is optimized to move and rotate over time to model the deformation. To enforce the 3D scene geometry constraint during deformation, we explicitly extract 3D geometry features and integrate them in learning the 3D deformation. In this way, our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction. Extensive experimental results on both synthetic and real datasets demonstrate the superiority of our solution, which achieves new state-of-the-art performance. The project is available at https://npucvr.github.io/GaGS/

Paper Structure (25 sections, 17 equations, 16 figures, 5 tables)

This paper contains 25 sections, 17 equations, 16 figures, 5 tables.

Figures (16)

  • Figure 1: Geometric information exploited by different methods. a) Early dynamic NeRF methods such as DNeRF [pumarola2021_dnerf_cvpr21] directly encode the coordinate $\mathbf{p}$ of the sample point as the input feature for the deformation network. b) Interpolation is used to fuse features from neighbouring grids, and multiscale interpolation enhances the local geometry information [guo2022_NDVG_arxiv, liu2023_robust_CVPR, fang2022_TANV_arxiv, turki2023_SUDS_CVPR]. c) We propose to voxelize a set of Gaussian distributions and use a sparse convolution network to extract geometry-aware features for deformation learning.
  • Figure 2: The pipeline of our proposed 3D geometry-aware deformable Gaussian splatting. In the Gaussian canonical field, we reconstruct a static scene in canonical space using 3D Gaussian distributions. We extract positional features using an MLP and local geometric features using a 3D U-Net; another MLP fuses them to form the geometry-aware features. In the deformation field, an MLP takes the geometry-aware features and the timestamp $t$ and estimates the 3D Gaussian deformation, which transfers the canonical 3D Gaussian distributions to timestamp $t$. Finally, a rasterizer renders the transformed 3D Gaussians into images.
  • Figure 3: Our density control is designed for dynamic scenes. We control the densification of Gaussian distributions according to their transformed parameters at timestamp $t$ rather than their parameters in canonical space.
  • Figure 4: Qualitative comparisons between baselines and our method on the synthetic dataset.
  • Figure 5: Qualitative comparisons between baselines and our method on the HyperNeRF real dataset [park2021hypernerf].
  • ...and 11 more figures
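
The voxelization step described in the Figure 1(c) caption can be sketched as follows. This is only an illustration under stated assumptions, not the paper's code: the function name `voxelize_centers`, the voxel size, and the use of the mean Gaussian center as the per-voxel feature are all hypothetical, and the sparse convolution network that would consume the occupied voxels is not shown.

```python
import numpy as np


def voxelize_centers(centers: np.ndarray, voxel_size: float):
    """Group Gaussian centers (N, 3) into occupied voxels.

    Returns integer voxel coordinates (M, 3), one feature per voxel (M, 3),
    and an index (N,) mapping every Gaussian to its voxel.
    """
    # Integer grid coordinates relative to the minimum corner of the scene.
    coords = np.floor((centers - centers.min(axis=0)) / voxel_size).astype(np.int64)
    # Keep only occupied voxels; `inverse` maps each Gaussian to its voxel row.
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions
    # Assumed per-voxel feature: mean of the Gaussian centers inside the voxel.
    counts = np.bincount(inverse, minlength=len(voxels))
    sums = np.zeros((len(voxels), 3))
    np.add.at(sums, inverse, centers)
    voxel_feats = sums / counts[:, None]
    return voxels, voxel_feats, inverse


# Usage: three Gaussians, two of which fall into the same voxel.
centers = np.array([[0.0, 0.0, 0.0],
                    [0.1, 0.1, 0.1],
                    [1.0, 1.0, 1.0]])
voxels, voxel_feats, inverse = voxelize_centers(centers, voxel_size=0.5)
# Gather voxel features back to one row per Gaussian.
per_gaussian_feats = voxel_feats[inverse]
```

Storing only the occupied voxels keeps the representation sparse, and the `inverse` index lets features computed on the voxel grid be gathered back to the individual Gaussians.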