Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly 4D Reconstruction

Zhening Liu, Yingdong Hu, Xinjie Zhang, Rui Song, Jiawei Shao, Zehong Lin, Jun Zhang

TL;DR

This paper tackles online per-timestep 4D dynamic spatial reconstruction from causal multi-view inputs. It introduces DASS, a three-stage pipeline comprising selective inheritance, dynamics-aware shift, and error-guided densification to model temporal continuity, distinguish dynamic/static components, and adapt to emerging objects. The approach leverages learnable per-Gaussian masks and dual deformation networks to achieve fast convergence and high-fidelity rendering in real time, validated on N3DV and Meet Room datasets. Results show state-of-the-art online performance with real-time rendering, enabling practical live streaming of dynamic 4D scenes.

Abstract

The recent development of 3D Gaussian Splatting (3DGS) has led to great interest in 4D dynamic spatial reconstruction. Existing approaches mainly rely on full-length multi-view videos, while there has been limited exploration of online reconstruction methods that enable on-the-fly training and per-timestep streaming. Current 3DGS-based streaming methods treat the Gaussian primitives uniformly and constantly renew the densified Gaussians, thereby overlooking the difference between dynamic and static features as well as neglecting the temporal continuity in the scene. To address these limitations, we propose a novel three-stage pipeline for iterative streamable 4D dynamic spatial reconstruction. Our pipeline comprises a selective inheritance stage to preserve temporal continuity, a dynamics-aware shift stage to distinguish dynamic and static primitives and optimize their movements, and an error-guided densification stage to accommodate emerging objects. Our method achieves state-of-the-art performance in online 4D reconstruction, demonstrating the fastest on-the-fly training, superior representation quality, and real-time rendering capability. Project page: https://www.liuzhening.top/DASS
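To make the data flow of the three stages concrete, the following is a toy control-flow sketch in Python, not the authors' implementation. It tracks only Gaussian centres, and every name in it (`selective_inheritance`, `dynamics_aware_shift`, `error_guided_densification`, `stream_step`) is hypothetical. The real stages are learned; here the inheritance mask, the dynamic/static split, the two offsets, and the densification locations are simply passed in as assumed inputs.

```python
import numpy as np

def selective_inheritance(prev_xyz, keep):
    # Stage 1 stand-in: reuse only the Gaussians flagged as still valid.
    return prev_xyz[keep]

def dynamics_aware_shift(xyz, is_dynamic, dyn_shift, static_shift):
    # Stage 2 stand-in: dynamic and static Gaussians get separate motion models.
    # Assumption: fixed offsets replace the learned deformations.
    return np.where(is_dynamic[:, None], xyz + dyn_shift, xyz + static_shift)

def error_guided_densification(xyz, error_xyz):
    # Stage 3 stand-in: add Gaussians at poorly reconstructed locations.
    return np.concatenate([xyz, error_xyz], axis=0)

def stream_step(prev_xyz, keep, is_dynamic, dyn_shift, static_shift, error_xyz):
    # One timestep: inherit, shift, then densify.
    xyz = selective_inheritance(prev_xyz, keep)
    xyz = dynamics_aware_shift(xyz, is_dynamic[keep], dyn_shift, static_shift)
    return error_guided_densification(xyz, error_xyz)
```

Calling `stream_step` once per incoming timestep, with the previous output as `prev_xyz`, mirrors the iterative, causal structure described above: each step only sees the previous state and the current inputs.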

Paper Structure

This paper contains 27 sections, 13 equations, 10 figures, 6 tables, 1 algorithm.

Figures (10)

  • Figure 1: (Left) An overview of our three-stage pipeline, referred to as DASS. (Middle) Performance of 4D dynamic spatial reconstruction methods in terms of per-timestep training time and reconstruction quality (PSNR) on the N3DV dataset (Li et al., 2022). Online methods are represented by circles, while offline methods are indicated by triangles. The size of each marker is proportional to the rendering speed (FPS). Our method achieves the fastest online training speed, superior reconstruction quality, and real-time rendering capability. (Right) Visual comparisons with baseline methods, illustrating that our method preserves fine-grained details and recovers diverse dynamics.
  • Figure 2: Overview of our proposed DASS framework. The selective inheritance stage (Green) exploits the temporal continuity and adaptively preserves Gaussians from the previous timestep. The dynamics-aware shift stage (Blue) distinguishes the dynamic and static elements and optimizes the deformations. The error-guided densification stage (Yellow) detects and densifies the areas with weak reconstruction based on positional gradients and distortions. Variables highlighted in red represent learnable parameters in each stage, whose training is significantly more lightweight than tuning all Gaussian parameters.
  • Figure 3: Histogram of Gaussian deformations in the flame steak scene of the N3DV dataset. The overall distribution of Gaussian deformations (Yellow) is widely spread, with the majority falling into the low deformation range (less than 0.01). The dynamic (Orange) and static (Blue) components display different deformation patterns, where significant transformations are mainly concentrated in the dynamic component and minimal transformations are primarily found in the static component.
  • Figure 4: Pipeline of obtaining the per-Gaussian dynamics mask. The process begins with optical flow and segmentation, which provide 2D priors to identify the dynamic object IDs. These IDs are then used to find the corresponding Gaussians in the 3D space.
  • Figure 5: (Left) Gaussians identified for densification based on positional gradients. (Right) Gaussians identified for densification using error-guided distortion projection. This comparison verifies that our proposed strategy effectively prioritizes emerging objects and achieves targeted compensation. While the vanilla densification strategy requires multiple optimization steps to fully recover the scene, our method concentrates on high-distortion areas with emerging objects, improving computational efficiency.
  • ...and 5 more figures
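
The last step described for Figure 4, going from 2D dynamic regions to the corresponding Gaussians in 3D, can be illustrated with a minimal single-view sketch. This is an assumption-laden simplification rather than the paper's procedure: it assumes a binary dynamic mask has already been derived from the optical-flow and segmentation priors, uses a pinhole camera, projects only Gaussian centres, and ignores occlusion and multi-view aggregation. The function name `lift_mask_to_gaussians` and its parameters are hypothetical.

```python
import numpy as np

def lift_mask_to_gaussians(xyz_cam, K, dynamic_mask_2d):
    """Flag Gaussians whose projected centre lands on a dynamic pixel.

    xyz_cam: (N, 3) Gaussian centres in camera coordinates (z forward).
    K: (3, 3) pinhole intrinsics.
    dynamic_mask_2d: (H, W) boolean image, True where 2D priors mark motion.
    """
    h, w = dynamic_mask_2d.shape
    z = xyz_cam[:, 2]
    in_front = z > 1e-6
    # Avoid dividing by zero or negative depth; such points are rejected below.
    safe_z = np.where(in_front, z, 1.0)
    uv = (K @ xyz_cam.T).T
    u = np.round(uv[:, 0] / safe_z).astype(int)
    v = np.round(uv[:, 1] / safe_z).astype(int)
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    is_dynamic = np.zeros(len(xyz_cam), dtype=bool)
    # Image arrays are indexed as [row, column] = [v, u].
    is_dynamic[visible] = dynamic_mask_2d[v[visible], u[visible]]
    return is_dynamic
```

With several cameras, one plausible extension (again an assumption) is to run this per view and combine the per-view flags, for example with a logical OR or a vote.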