UPGS: Unified Pose-aware Gaussian Splatting for Dynamic Scene Deblurring

Zhijing Wu, Longguang Wang

Abstract

Reconstructing dynamic 3D scenes from monocular video has broad applications in AR/VR, robotics, and autonomous navigation, but often fails under severe motion blur caused by camera and object motion. Existing methods commonly follow a two-step pipeline in which camera poses are estimated first and 3D Gaussians are optimized afterwards. Because blur artifacts usually undermine pose estimation, pose errors can propagate and degrade the reconstruction. To address this issue, we introduce a unified optimization framework that treats camera poses as learnable parameters alongside the 3DGS attributes for end-to-end optimization. Specifically, we recast camera and object motion as per-primitive SE(3) affine transformations on 3D Gaussians and formulate a unified optimization objective. For stable optimization, we introduce a three-stage training schedule that optimizes camera poses and Gaussians alternately. In particular, the 3D Gaussians are first trained with the poses fixed, and the poses are then optimized with the 3D Gaussians frozen. Finally, all learnable parameters are optimized together. Extensive experiments on the Stereo Blur dataset and challenging real-world sequences demonstrate that our method achieves significant gains in reconstruction quality and pose estimation accuracy over prior dynamic deblurring methods.
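
The three-stage schedule described above can be illustrated with a minimal sketch of the gating logic alone. The function name `stage_flags`, the `StageFlags` type, and the stage lengths are hypothetical and do not come from the paper; the sketch only shows which parameter group would be trainable at a given step, not how the authors implement the optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageFlags:
    """Which parameter groups receive gradient updates at a given step."""
    train_gaussians: bool
    train_poses: bool


def stage_flags(step: int, gaussian_only_steps: int, pose_only_steps: int) -> StageFlags:
    """Map a training step to the active stage of a three-stage schedule.

    Stage 1: Gaussians trained, poses fixed.
    Stage 2: poses trained, Gaussians frozen.
    Stage 3: both trained jointly.
    The stage lengths are hypothetical inputs, not values from the paper.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < gaussian_only_steps:
        return StageFlags(train_gaussians=True, train_poses=False)
    if step < gaussian_only_steps + pose_only_steps:
        return StageFlags(train_gaussians=False, train_poses=True)
    return StageFlags(train_gaussians=True, train_poses=True)


# Example with made-up stage lengths: 100 Gaussian-only steps, 50 pose-only steps.
for step in (0, 100, 150):
    print(step, stage_flags(step, gaussian_only_steps=100, pose_only_steps=50))
```

In a real training loop, these flags would typically decide which optimizer steps or which parameters are updated at each iteration.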

Paper Structure

This paper contains 26 sections, 19 equations, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: An overview of UPGS. We adopt a three-stage training schedule for optimization. Camera motion is represented as trainable SE(3) affine transformations (Sec 3.2) on Gaussian primitives, so that camera poses can be optimized together with the reconstructed scene. Using COLMAP poses for initialization, we first optimize the Gaussian primitives with the poses fixed. Next, with the Gaussian primitives frozen, we refine only the affine transformations. In the final stage, we jointly fine-tune scene and pose so they co-adapt (Sec 3.3), yielding sharper renders, higher reconstruction fidelity, and more accurate camera trajectories.
  • Figure 2: Visual comparison on the Stereo Blur and BARD-GS datasets. The orange boxes highlight regions with intense dynamic motion and the blue boxes indicate purely static areas.
  • Figure 3: Visual comparison of ablation studies on (a) the affine warp representation and (b) the stagewise optimization strategy.
  • Figure 4: Ablation Study: Effect of affine transformations on geometric modeling.
  • Figure 5: Ablation Studies: Impact of our optimization strategy under large camera motion.
  • ...and 1 more figure
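
As a rough illustration of what applying an SE(3) transformation to a Gaussian primitive involves (Figure 1), the sketch below uses the standard rigid-transform rule for a Gaussian: the mean maps to R·mean + t and the covariance to R·cov·Rᵀ. This is general geometry, not the paper's parameterization, which this listing does not specify. All helper names (`rotation_z`, `transform_gaussian`, and so on) are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def rotation_z(theta):
    """3x3 rotation by angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform_gaussian(mean, cov, rotation, translation):
    """Apply a rigid SE(3) transform (rotation, translation) to a 3D Gaussian.

    mean' = R @ mean + t
    cov'  = R @ cov @ R^T
    """
    new_mean = [sum(rotation[i][k] * mean[k] for k in range(3)) + translation[i]
                for i in range(3)]
    new_cov = matmul(matmul(rotation, cov), transpose(rotation))
    return new_mean, new_cov


# Hypothetical example: a Gaussian elongated along x, rotated 90 degrees about z
# and shifted along z. The long axis should end up along y.
mean = [1.0, 0.0, 0.0]
cov = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
new_mean, new_cov = transform_gaussian(mean, cov, rotation_z(math.pi / 2), [0.0, 0.0, 2.0])
print([round(v, 6) for v in new_mean])
print([[round(v, 6) for v in row] for row in new_cov])
```

Because the transform acts on each primitive's mean and covariance directly, a pose update moves the scene representation itself. This matches the caption's point that poses can be optimized together with the reconstructed scene.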