
Relaxed Rigidity with Ray-based Grouping for Dynamic Gaussian Splatting

Junoh Lee, Junmyeong Lee, Yeon-Ji Song, Inhwan Bae, Jisu Shin, Hae-Gon Jeon, Jin-Hwa Kim

Abstract

The reconstruction of dynamic 3D scenes using 3D Gaussian Splatting has shown significant promise. A key challenge, however, remains in modeling realistic motion, as most methods fail to align the motion of Gaussians with real-world physical dynamics. This misalignment is particularly problematic for monocular video datasets, where failing to maintain coherent motion undermines local geometric structure, ultimately leading to degraded reconstruction quality. Consequently, many state-of-the-art approaches rely heavily on external priors, such as optical flow or 2D tracks, to enforce temporal coherence. In this work, we propose a novel method to explicitly preserve the local geometric structure of Gaussians across time in 4D scenes. Our core idea is to introduce a view-space ray grouping strategy that clusters Gaussians intersected by the same ray, considering only those whose $\alpha$-blending weights exceed a threshold. We then apply constraints to these groups to maintain a consistent spatial distribution, effectively preserving their local geometry. This approach enforces a more physically plausible motion model by ensuring that local geometry remains stable over time, eliminating the reliance on external guidance. We demonstrate the efficacy of our method by integrating it into two distinct baseline models. Extensive experiments on challenging monocular datasets show that our approach significantly outperforms existing methods, achieving superior temporal consistency and reconstruction quality.
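
To make the grouping step concrete, here is a minimal PyTorch-style sketch of per-ray selection, assuming opacities along a ray are available in front-to-back order; the names `ray_group`, `alphas`, `gaussian_ids`, and the default value of `tau` are our assumptions, not the authors' implementation. It computes the standard $\alpha$-blending weights $w_i = \alpha_i \prod_{j<i}(1-\alpha_j)$ and keeps the Gaussians whose weight exceeds the threshold $\tau$.

```python
import torch

def ray_group(alphas: torch.Tensor, gaussian_ids: torch.Tensor,
              tau: float = 0.1) -> torch.Tensor:
    """Select the Gaussians that one ray groups together (illustrative sketch).

    alphas:       (N,) opacities of the Gaussians hit by the ray, sorted front-to-back.
    gaussian_ids: (N,) global indices of those Gaussians, in the same order.
    tau:          blending-weight threshold for group membership (assumed value).
    """
    # Transmittance T_i = prod_{j<i} (1 - alpha_j), with T_0 = 1.
    transmittance = torch.cumprod(
        torch.cat([alphas.new_ones(1), 1.0 - alphas[:-1]]), dim=0
    )
    # Standard alpha-blending weights w_i = alpha_i * T_i.
    weights = alphas * transmittance
    # Keep only the Gaussians that contribute meaningfully to this pixel.
    return gaussian_ids[weights > tau]
```

Applied to every ray of a training view, this yields one group per pixel; the constraints described in the abstract are then imposed within each group so that its spatial distribution stays consistent over time.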

Figures (7)

  • Figure 1: Overview. Before regularization (first panel), individual Gaussians exhibit temporally inconsistent motion; each colored trajectory denotes the time-varying movement of a single Gaussian. Our method first performs ray-based grouping to cluster spatially adjacent Gaussians along view rays (second panel), with a zoomed-in example shown in the third panel. Motion and spectral regularization are then applied within each group to enforce coherent dynamics (see the illustrative sketch after this list). After regularization (fourth panel), the Gaussian trajectories become temporally aligned and physically consistent.
  • Figure 2: Ray-based Grouping visualization. We gather the Gaussians penetrated by a ray and select those whose contribution $w_i$ exceeds the threshold $\tau$.
  • Figure 3: Group size visualization. Distribution of Gaussian group sizes for the D-NeRF (left) and HyperNeRF (right) datasets. The plot shows the counts of groups of varying sizes across different image frames. Groups of size zero are omitted.
  • Figure 4: Qualitative comparisons between baselines and our method on the D-NeRF and HyperNeRF datasets.
  • Figure 5: Qualitative comparisons between baselines and our method on the NeRF-DS dataset.
  • ...and 2 more figures
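
Figure 1 refers to motion regularization applied within each group. The paper's exact loss is not given in this excerpt; as a purely illustrative sketch of one common way to preserve a group's local geometry, a rigidity-style penalty keeps pairwise distances between grouped Gaussian centers stable across adjacent timesteps (the function and variable names below are hypothetical):

```python
import torch

def group_rigidity_loss(mu_t: torch.Tensor, mu_t1: torch.Tensor) -> torch.Tensor:
    """Hypothetical rigidity-style penalty for one group (not the paper's loss).

    mu_t, mu_t1: (G, 3) centers of a group's Gaussians at times t and t+1.
    """
    # Pairwise distances within the group at each timestep.
    d_t = torch.cdist(mu_t, mu_t)
    d_t1 = torch.cdist(mu_t1, mu_t1)
    # Penalize changes in relative distances only; the group as a whole
    # can still translate and rotate freely.
    return (d_t - d_t1).abs().mean()
```

Because such a penalty acts only on relative distances, it constrains local structure without freezing group motion, which is consistent with the "relaxed rigidity" framing in the title.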