Event-boosted Deformable 3D Gaussians for Dynamic Scene Reconstruction
Wenhao Xu, Wenming Weng, Yueyi Zhang, Ruikang Xu, Zhiwei Xiong
TL;DR
This paper addresses a limitation of deformable 3D Gaussian Splatting (3D-GS) for dynamic scene reconstruction: RGB cameras have low temporal resolution and therefore miss intermediate motion. It pairs deformable 3D-GS with event cameras, which capture motion at high temporal resolution. It introduces GS-Threshold Joint Modeling (GTJM), which jointly estimates event thresholds and scene geometry using RGB cues and pseudo-intermediate frames, and Dynamic-Static Decomposition (DSD), which isolates dynamic regions for targeted deformation, together with a one-minute 3D-to-2D decomposition. The method is evaluated on a new event-inclusive 4D benchmark with synthetic and real-world scenes, where it achieves state-of-the-art PSNR, SSIM, and LPIPS and renders faster than RGB-only and prior event-based baselines. These contributions enable more faithful and efficient dynamic scene capture for AR/VR, robotics, and animation.
Abstract
Deformable 3D Gaussian Splatting (3D-GS) is limited by missing intermediate motion information due to the low temporal resolution of RGB cameras. To address this, we introduce the first approach combining event cameras, which capture high-temporal-resolution, continuous motion data, with deformable 3D-GS for dynamic scene reconstruction. We observe that threshold modeling for events plays a crucial role in achieving high-quality reconstruction. Therefore, we propose a GS-Threshold Joint Modeling strategy, creating a mutually reinforcing process that greatly improves both 3D reconstruction and threshold modeling. Moreover, we introduce a Dynamic-Static Decomposition strategy that first identifies dynamic areas by exploiting the inability of static Gaussians to represent motions, then applies a buffer-based soft decomposition to separate dynamic and static areas. This strategy accelerates rendering by avoiding unnecessary deformation in static areas, and focuses on dynamic areas to enhance fidelity. Additionally, we contribute the first event-inclusive 4D benchmark with synthetic and real-world dynamic scenes, on which our method achieves state-of-the-art performance.
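To illustrate why the event threshold matters, the sketch below uses the standard idealized event-camera model, in which a pixel fires an event each time its log intensity changes by a contrast threshold C. This is not the paper's GTJM procedure. The function names `simulate_events` and `event_loss`, the squared-error form of the loss, and the `eps` constant are assumptions made for illustration only.

```python
import math


def simulate_events(log_prev, log_curr, threshold):
    """Signed event count per pixel under the idealized model.

    A pixel emits one event per `threshold` of log-intensity change;
    the sign of the count encodes polarity (brighter or darker).
    """
    counts = []
    for lp, lc in zip(log_prev, log_curr):
        delta = lc - lp
        n = int(abs(delta) / threshold)  # whole threshold crossings
        counts.append(n if delta >= 0 else -n)
    return counts


def event_loss(rendered_prev, rendered_curr, counts, threshold, eps=1e-6):
    """Mean squared gap between the rendered log-intensity change and the
    change implied by the events (threshold * signed count).

    `rendered_prev` and `rendered_curr` are linear intensities, standing in
    for two renderings at consecutive timestamps (hypothetical inputs).
    """
    total = 0.0
    for ip, ic, n in zip(rendered_prev, rendered_curr, counts):
        predicted = math.log(ic + eps) - math.log(ip + eps)
        observed = threshold * n
        total += (predicted - observed) ** 2
    return total / len(counts)


# Two pixels whose log intensity changes by +0.5 and -0.75.
prev = [1.0, 1.0]
curr = [math.exp(0.5), math.exp(-0.75)]
counts = simulate_events([0.0, 0.0], [0.5, -0.75], threshold=0.25)

loss_true = event_loss(prev, curr, counts, threshold=0.25)
loss_wrong = event_loss(prev, curr, counts, threshold=0.5)
```

With the threshold that generated the events, the loss is close to zero. With a mis-specified threshold, the same event counts imply a different brightness change and the loss grows, even though the rendered frames are exact. This is one way to see why supervising reconstruction with events depends on an accurate threshold estimate.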
