MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting

Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong, Won-Sik Cheong, Jihyong Oh, Munchurl Kim

TL;DR

MoDec-GS tackles memory efficiency and complex real-world motion in dynamic 3D Gaussian Splatting by introducing Global-to-Local Motion Decomposition (GLMD) and Temporal Interval Adjustment (TIA). Global Anchor Deformation (GAD) captures broad motion at the anchor level, while Local Gaussian Deformation (LGD) refines the remaining local motion of each Local Canonical Scaffold (Local CS) using a shared hexplane; TIA automatically assigns the temporal coverage of each Local CS without requiring optical flow. The approach reduces model size by about 70% on average while maintaining or improving rendering quality on monocular video datasets. This enables compact, real-time dynamic scene reconstruction from real-world videos.

Abstract

3D Gaussian Splatting (3DGS) has made significant strides in scene representation and neural rendering, with intense efforts focused on adapting it for dynamic scenes. Despite delivering remarkable rendering quality and speed, existing methods struggle with storage demands and representing complex real-world motions. To tackle these issues, we propose MoDec-GS, a memory-efficient Gaussian splatting framework designed for reconstructing novel views in challenging scenarios with complex motions. We introduce Global-to-Local Motion Decomposition (GLMD) to effectively capture dynamic motions in a coarse-to-fine manner. This approach leverages Global Canonical Scaffolds (Global CS) and Local Canonical Scaffolds (Local CS), extending static Scaffold representation to dynamic video reconstruction. For Global CS, we propose Global Anchor Deformation (GAD) to efficiently represent global dynamics along complex motions, by directly deforming the implicit Scaffold attributes, which are anchor position, offset, and local context features. Next, we finely adjust local motions via the Local Gaussian Deformation (LGD) of Local CS explicitly. Additionally, we introduce Temporal Interval Adjustment (TIA) to automatically control the temporal coverage of each Local CS during training, allowing MoDec-GS to find optimal interval assignments based on the specified number of temporal segments. Extensive evaluations demonstrate that MoDec-GS achieves an average 70% reduction in model size over state-of-the-art methods for dynamic 3D Gaussians from real-world dynamic videos while maintaining or even improving rendering quality.
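
The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the constant per-segment shift, and the linear `toy_residual` are assumptions made only for illustration, and the learned components (local context features, the hexplane, any networks) are replaced by fixed arrays. It only shows the order of operations: deform the anchors once per temporal segment, place Gaussian centers at anchor plus offset, then add a small time-dependent correction per Gaussian.

```python
import numpy as np

def global_anchor_deformation(anchors, segment_shift):
    """Coarse stage (assumed form): move whole anchors for one temporal segment."""
    return anchors + segment_shift

def spawn_gaussians(anchors, offsets):
    """Place K Gaussian centers around each anchor as anchor + offset.

    anchors: (A, 3), offsets: (A, K, 3) -> centers: (A * K, 3)
    """
    return (anchors[:, None, :] + offsets).reshape(-1, 3)

def local_gaussian_deformation(centers, t, residual_fn):
    """Fine stage (assumed form): add a time-dependent residual to each center."""
    return centers + residual_fn(centers, t)

def toy_residual(centers, t):
    """Hypothetical stand-in for a learned deformation field: linear drift along z."""
    return t * np.tile([0.0, 0.0, 0.5], (len(centers), 1))

# Toy data: 2 anchors at the origin, 2 Gaussians per anchor.
anchors = np.zeros((2, 3))
offsets = np.full((2, 2, 3), 0.1)
segment_shift = np.array([1.0, 0.0, 0.0])  # hypothetical global motion of this segment

local_anchors = global_anchor_deformation(anchors, segment_shift)
centers = spawn_gaussians(local_anchors, offsets)
deformed = local_gaussian_deformation(centers, t=0.2, residual_fn=toy_residual)
print(deformed)
```

In this sketch the large displacement is stored once per anchor and per segment, while the per-Gaussian correction stays small, which is the intuition behind splitting motion into a global and a local part.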

Paper Structure

This paper contains 34 sections, 13 equations, 10 figures, 9 tables, 1 algorithm.

Figures (10)

  • Figure 1: Novel view synthesis results on [Park2021HyperNeRF]. We introduce MoDec-GS, a novel framework for learning compact dynamic 3D Gaussians from real-world videos with complex motion. While existing SOTA methods [Yang2024Deformable3DGS, Huang2024SCGS, Wu20244DGS] have difficulty modeling such complex combinations of global and local motions, our approach handles them effectively thanks to GLMD (Sec. \ref{sect:method overview}) and outperforms the prior methods in rendering quality even with a compact model size. The metrics under each framework are PSNR (dB)$\uparrow$ / LPIPS [zhang2018unreasonable]$\downarrow$ / Storage (MB)$\downarrow$.
  • Figure 2: Overview of our MoDec-GS framework. To effectively train dynamic 3D Gaussians with complex motion, we introduce Global-to-Local Motion Decomposition (GLMD) (Sec. \ref{sect:method overview}). We first train a Global Canonical Scaffold-GS (Global CS) on all frames, then apply Global Anchor Deformation (GAD) to obtain Local Canonical Scaffold-GS (Local CS) instances, each dedicated to representing its corresponding temporal segment (Sec. \ref{sect4.2}). Next, to finely adjust the remaining local motion, we apply Local Gaussian Deformation (LGD), which explicitly deforms the reconstructed 3D Gaussians with a shared hexplane (Sec. \ref{sect4.3}). During training, Temporal Interval Adjustment (TIA) is performed, optimizing the temporal intervals into non-uniform intervals that adapt to the scene's level of motion (Sec. \ref{sect4.4}).
  • Figure 3: Concept and effect of 2-stage deformation. When representing the complex motion of 3D Gaussians, global movement across time intervals can be handled more efficiently by deforming the anchors themselves. In contrast, subtle motions of individual 3D Gaussians within a time interval can be addressed effectively by explicit deformation of each Gaussian.
  • Figure 4: Qualitative comparison on three datasets [Gao2022Dycheck, Park2021HyperNeRF, yoon2020novel]. The yellow boxes highlight areas where the proposed method achieves notable visual quality improvements, and the storage for the corresponding sequence is displayed below each rendered patch.
  • Figure 5: Effectiveness of TIA. Initially uniform intervals (black dotted lines) are adaptively reallocated based on motion complexity (blue lines), as indicated by normalized optical flow magnitude.
  • ...and 5 more figures
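
The non-uniform intervals described for TIA (Figures 2 and 5) imply that each frame time must be mapped to the temporal segment that covers it. The sketch below shows only that lookup; it does not reproduce how TIA moves the boundaries during training, which is not described in this section. The function name `segment_index` and the boundary values are hypothetical, and time is assumed to be normalized to [0, 1].

```python
import bisect

def segment_index(t, boundaries):
    """Return the index of the temporal segment that contains time t.

    boundaries: sorted interior cut points in (0, 1); N - 1 cuts define N segments.
    A time equal to a cut point is assigned to the later segment.
    """
    return bisect.bisect_right(boundaries, t)

# Four segments with uniform cuts, as at the start of training.
uniform = [0.25, 0.5, 0.75]

# Hypothetical adjusted cuts: shorter segments early, where motion is assumed
# to be more complex, and one long segment for the calmer remainder.
adjusted = [0.1, 0.2, 0.6]

for t in (0.0, 0.15, 0.3, 0.9):
    print(t, segment_index(t, uniform), segment_index(t, adjusted))
```

With the same number of segments, the adjusted cuts spend three of the four segments on the first 60% of the sequence, which mirrors the reallocation toward high-motion spans that Figure 5 describes.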