1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Yuheng Yuan, Qiuhong Shen, Xingyi Yang, Xinchao Wang

TL;DR

Dynamic scene reconstruction with 4D Gaussian Splatting suffers from excessive storage and slow rendering due to temporal redundancy. The authors propose 4DGS-1K, combining a Spatial-Temporal Variation Score for pruning short-lived Gaussians with a temporal-filtering scheme using shared masks across key-frames to reduce computations. Empirical results on Neural 3D Video and D-NeRF datasets show about 41× storage reduction and 9× faster rasterization, achieving 1000+ FPS with comparable visual quality. This work enables practical real-time rendering of complex dynamic scenes and suggests a pathway toward universal compression for Gaussian-based dynamic representations.

Abstract

4D Gaussian Splatting (4DGS) has recently gained considerable attention as a method for reconstructing dynamic scenes. Despite achieving superior quality, 4DGS typically requires substantial storage and suffers from slow rendering speed. In this work, we delve into these issues and identify two key sources of temporal redundancy. (Q1) Short-Lifespan Gaussians: 4DGS uses a large portion of Gaussians with short temporal span to represent scene dynamics, leading to an excessive number of Gaussians. (Q2) Inactive Gaussians: When rendering, only a small subset of Gaussians contributes to each frame. Despite this, all Gaussians are processed during rasterization, resulting in redundant computation overhead. To address these redundancies, we present 4DGS-1K, which runs at over 1000 FPS on modern GPUs. For Q1, we introduce the Spatial-Temporal Variation Score, a new pruning criterion that effectively removes short-lifespan Gaussians while encouraging 4DGS to capture scene dynamics using Gaussians with longer temporal spans. For Q2, we store a mask for active Gaussians across consecutive frames, significantly reducing redundant computations in rendering. Compared to vanilla 4DGS, our method achieves a $41\times$ reduction in storage and $9\times$ faster rasterization speed on complex dynamic scenes, while maintaining comparable visual quality. Please see our project page at https://4DGS-1K.github.io.
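
The mask idea for Q2 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed key-frame interval, and the choice to take the union of the two neighbouring key-frame masks are all assumptions made for illustration, and the per-frame active sets are taken as given rather than computed by a rasterizer.

```python
from bisect import bisect_right


def build_keyframe_masks(active_per_frame, key_interval):
    """Keep the active-Gaussian index set only at every key_interval-th frame.

    active_per_frame: list where entry t holds the indices of Gaussians that
    contribute to frame t (assumed to be known from a prior rendering pass).
    """
    return {
        t: set(active_per_frame[t])
        for t in range(0, len(active_per_frame), key_interval)
    }


def gaussians_to_render(keyframe_masks, t):
    """Return the candidate Gaussians for frame t.

    Assumption: the candidates are the union of the masks stored at the
    key-frame at or before t and the next key-frame after it.
    """
    key_times = sorted(keyframe_masks)
    i = max(bisect_right(key_times, t) - 1, 0)   # last key-frame <= t
    j = min(i + 1, len(key_times) - 1)           # following key-frame, clamped
    return keyframe_masks[key_times[i]] | keyframe_masks[key_times[j]]


# Hypothetical toy data: 5 frames, 4 Gaussians (indices 0..3).
frames = [[0, 1], [1], [1, 2], [2], [2, 3]]
masks = build_keyframe_masks(frames, key_interval=2)

print(gaussians_to_render(masks, 1))  # candidates drawn from key-frames 0 and 2
print(gaussians_to_render(masks, 4))  # last key-frame: only its own mask applies
```

In this toy case, frame 1 is rendered from three candidate Gaussians instead of four. The saving in a real scene would depend on how small the active sets are relative to the full model.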

Paper Structure

This paper contains 24 sections, 8 equations, 16 figures, 6 tables.

Figures (16)

  • Figure 1: Compressibility and Rendering Speed. We introduce 4DGS-1K, a novel compact representation with high rendering speed. In contrast to 4D Gaussian Splatting (4DGS) [yang2023real], we achieve rasterization at 1000+ FPS while maintaining comparable photorealistic quality with only $2\%$ of the original storage size. The right plot shows results on the N3V dataset [li2022neural], where the radius of each dot corresponds to the storage size.
  • Figure 2: Temporal Redundancy Study. (a) The $\Sigma_t$ distribution of 4DGS. The red line shows the result of vanilla 4DGS; the other two lines show that our model effectively reduces the number of transient Gaussians with small $\Sigma_t$. (b) The active ratio during rendering at different timestamps. In vanilla 4DGS, most of the computation time is spent on inactive Gaussians, whereas 4DGS-1K significantly reduces the number of inactive Gaussians during rendering and so avoids unnecessary computation. (c) The IoU between the set of active Gaussians in the first frame and in frame $t$. Active Gaussians overlap significantly across adjacent frames.
  • Figure 3: Visualizations of Distribution of $\Sigma_t$. Most of these Gaussians are concentrated along the edges of moving objects.
  • Figure 4: Overview of 4DGS-1K. (a) We first calculate the spatial-temporal variation score for each 4D Gaussian on training views and prune Gaussians with a short lifespan (the red Gaussian). (b) The temporal filter removes inactive Gaussians before rendering to improve rendering speed. At a given timestamp $t$, the set of Gaussians participating in rendering is derived from the two adjacent key-frames, $t_0$ and $t_{0+\Delta_t}$.
  • Figure 5: Qualitative comparisons of 4DGS and our method.
  • ...and 11 more figures
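
The two quantities plotted in Figure 2(b) and 2(c) are simple set statistics. The sketch below shows one way they could be computed. The helper names and the toy index sets are hypothetical, and treating "active" as a plain set of Gaussian indices per frame is an assumption for illustration.

```python
def active_ratio(active, total_gaussians):
    """Fraction of all Gaussians that contribute to a given frame."""
    return len(set(active)) / total_gaussians


def iou(active_a, active_b):
    """Intersection over union of two active-Gaussian index sets."""
    a, b = set(active_a), set(active_b)
    union = a | b
    # Two empty sets are treated as identical (a convention chosen here).
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy data: 16 Gaussians in total.
first_frame = {0, 1, 2, 3}
later_frame = {2, 3, 4}

print(active_ratio(first_frame, 16))   # 0.25
print(iou(first_frame, later_frame))   # 2 shared / 5 in the union = 0.4
```

A high IoU between nearby frames is what makes it reasonable to reuse a mask stored at a key-frame for the frames around it.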