Sparse4DGS: 4D Gaussian Splatting for Sparse-Frame Dynamic Scene Reconstruction

Changyue Shi, Chuxiao Yang, Xinyuan Hu, Minghao Chen, Wenwen Pan, Yan Yang, Jiajun Ding, Zhou Yu, Jun Yu

TL;DR

Sparse4DGS tackles the challenge of reconstructing dynamic 4D scenes from sparse input frames by injecting texture richness cues into 3D Gaussians. It introduces a Texture Intensity ($TI$) Gaussian Field and two texture-aware mechanisms: Texture-Aware Deformation Regularization (TADR) and Texture-Aware Canonical Optimization (TACO), the latter leveraging SGLD-inspired updates to bias Gaussians toward texture-rich regions. Across NeRF-Synthetic, NeRF-DS, HyperNeRF, and the iPhone-4D dataset, Sparse4DGS outperforms prior dynamic and few-shot methods, especially at low frame rates, while preserving fine structural details. The approach enables photorealistic 4D reconstructions from sparse frames, broadening the practical applicability of Gaussian Splatting for real-world dynamic scenes.

Abstract

Dynamic Gaussian Splatting approaches have achieved remarkable performance for 4D scene reconstruction. However, these approaches rely on dense-frame video sequences for photorealistic reconstruction. In real-world scenarios, due to equipment constraints, sometimes only sparse frames are accessible. In this paper, we propose Sparse4DGS, the first method for sparse-frame dynamic scene reconstruction. We observe that dynamic reconstruction methods fail in both canonical and deformed spaces under sparse-frame settings, especially in areas with high texture richness. Sparse4DGS tackles this challenge by focusing on texture-rich areas. For the deformation network, we propose Texture-Aware Deformation Regularization, which introduces a texture-based depth alignment loss to regulate Gaussian deformation. For the canonical Gaussian field, we introduce Texture-Aware Canonical Optimization, which incorporates texture-based noise into the gradient descent process of canonical Gaussians. Extensive experiments show that when taking sparse frames as inputs, our method outperforms existing dynamic or few-shot techniques on NeRF-Synthetic, HyperNeRF, NeRF-DS, and our iPhone-4D datasets.
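The abstract describes Texture-Aware Canonical Optimization only at a high level: texture-based noise is added to the gradient descent update of the canonical Gaussians. The sketch below illustrates that general idea with an SGLD-style step in NumPy. It is not the authors' implementation. The function name `texture_aware_step`, the parameters `lr` and `noise_scale`, and the per-Gaussian `weight` array (standing in for whatever quantity the method derives from texture intensity) are all assumptions made for illustration, and how the weight should depend on texture intensity is not specified in this summary.

```python
import numpy as np


def texture_aware_step(positions, grads, weight, lr=1e-3, noise_scale=1.0, rng=None):
    """One noisy gradient step on Gaussian centres (illustrative sketch only).

    positions: (N, 3) canonical Gaussian centres
    grads:     (N, 3) gradient of the loss w.r.t. the centres
    weight:    (N,)   hypothetical per-Gaussian noise weight derived from
                      texture intensity (ASSUMPTION, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    positions = np.asarray(positions, dtype=np.float64)
    grads = np.asarray(grads, dtype=np.float64)
    weight = np.asarray(weight, dtype=np.float64)

    # Standard SGLD draws Gaussian noise with std sqrt(2 * lr) per step.
    noise = rng.standard_normal(positions.shape) * np.sqrt(2.0 * lr)

    # ASSUMPTION: the noise is modulated per Gaussian by the texture-derived weight.
    noise = noise_scale * weight[:, None] * noise

    # Plain gradient descent plus the weighted noise term.
    return positions - lr * grads + noise
```

With `weight` set to zero the update reduces to ordinary gradient descent, so the noise term can be read as an optional, per-Gaussian perturbation layered on top of the usual optimizer step.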

Paper Structure

This paper contains 15 sections, 14 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: In this work, we introduce Sparse4DGS, a novel approach for dynamic scene reconstruction using sparse input frames. In the "Sheet" scene from the NeRF-DS [yan2023nerf] dataset, when taking sparse frames as inputs, Sparse4DGS achieves high-quality novel view synthesis results in both canonical and deformed spaces.
  • Figure 2: Overall pipeline of Sparse4DGS. Left: The Sobel operator and Mono-Depth Estimator [ranftl2021vision] are employed to generate the texture intensity ($TI$) and depth maps from sparse frames. Top right: The $TI$ attribute is embedded in each Gaussian via $L_{tex}$. Texture-Aware Deformation Regularization is employed to align the rendered and ground truth texture intensity of depth maps with $L_{tadr}$. Bottom right: After receiving the original gradient, Texture-Aware Canonical Optimization introduces an additional texture-based noise to each Gaussian, thereby improving their concentration on texture-rich regions.
  • Figure 3: Visualization of texture intensity maps. Left: Input RGB images. Right: Extracted texture intensity maps.
  • Figure 4: Comparison between L1 distance and PCC for the rendered texture map. $L_{tex}$ with PCC achieves more precise texture embedding results.
  • Figure 5: Visualization of various methods on our iPhone-4D dataset with 5 FPS video inputs.
  • ...and 4 more figures
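
The captions for Figures 2 to 4 mention two concrete building blocks: a Sobel operator that produces texture intensity maps, and a Pearson correlation coefficient (PCC) used in place of an L1 distance when comparing texture maps. The NumPy sketch below shows one plausible form of each. The function names `sobel_texture_intensity` and `pcc_loss`, the normalisation to $[0, 1]$, the edge padding, and the `1 - PCC` form of the loss are assumptions for illustration; the paper's exact definitions of $TI$ and $L_{tex}$ are not given in this summary.

```python
import numpy as np


def sobel_texture_intensity(gray):
    """Sobel gradient magnitude of a 2D grayscale image, scaled to [0, 1].

    ASSUMPTION: texture intensity is taken to be the normalised gradient
    magnitude; the paper may post-process it differently.
    """
    gray = np.asarray(gray, dtype=np.float64)
    p = np.pad(gray, 1, mode="edge")  # replicate borders to keep the output size

    # Horizontal Sobel response: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    # Vertical Sobel response: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )

    mag = np.sqrt(gx**2 + gy**2)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def pcc_loss(rendered, target, eps=1e-8):
    """1 - Pearson correlation between two maps (hypothetical loss form)."""
    r = np.ravel(rendered) - np.mean(rendered)
    t = np.ravel(target) - np.mean(target)
    corr = np.sum(r * t) / (np.sqrt(np.sum(r * r) * np.sum(t * t)) + eps)
    return 1.0 - corr


# Usage: a vertical step edge yields high texture intensity only at the edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
ti_map = sobel_texture_intensity(image)

# An affinely rescaled copy has near-zero PCC loss but a clearly non-zero L1 error.
rescaled = 0.5 * ti_map + 0.1
print(pcc_loss(rescaled, ti_map), np.mean(np.abs(rescaled - ti_map)))
```

One general property worth noting is that PCC ignores affine rescaling of its inputs, whereas L1 penalises it. Whether this is the reason the authors report more precise texture embedding with PCC is not stated here.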