DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments

Mahmud A. Mohamad, Gamal Elghazaly, Arthur Hubert, Raphael Frank

TL;DR

Instead of directly using Spherical Harmonics (SH) to model the appearance of dynamic objects, a new method that dynamically estimates SH bases using wavelets is introduced, resulting in a better representation of dynamic object appearance in both space and time.

Abstract

This paper presents DENSER, an efficient and effective approach leveraging 3D Gaussian splatting (3DGS) for the reconstruction of dynamic urban environments. Several methods for photorealistic scene representation, both implicit, using neural radiance fields (NeRF), and explicit, using 3DGS, have shown promising results in reconstructing relatively complex dynamic scenes. However, modeling the dynamic appearance of foreground objects tends to be challenging, which limits the ability of these methods to capture subtleties and details of the scene, especially for distant dynamic objects. To this end, we propose DENSER, a framework that significantly enhances the representation of dynamic objects and accurately models their appearance in driving scenes. Instead of directly using Spherical Harmonics (SH) to model the appearance of dynamic objects, we introduce and integrate a new method that dynamically estimates SH bases using wavelets, resulting in a better representation of dynamic object appearance in both space and time. Besides object appearance, DENSER enhances object shape representation by densifying each object's point cloud across multiple scene frames, resulting in faster convergence of model training. Extensive evaluations on the KITTI dataset show that the proposed approach outperforms state-of-the-art methods by a wide margin. Source code and models will be uploaded to this repository: https://github.com/sntubix/denser
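
The abstract mentions densifying an object's point cloud across multiple scene frames. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each frame provides world-space points belonging to one object together with that object's pose (a rotation `R` and translation `t`, for instance from a tracked 3D bounding box). The function names and the input layout are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def world_to_object(p: Vec3, R: Mat3, t: Vec3) -> Vec3:
    """Map a world-space point into the object frame: R^T (p - t)."""
    d = (p[0] - t[0], p[1] - t[1], p[2] - t[2])
    # Multiplying by the transpose of R inverts the rotation.
    return tuple(sum(R[k][i] * d[k] for k in range(3)) for i in range(3))


def densify_object(frames) -> List[Vec3]:
    """Merge per-frame observations of one object into its reference frame.

    `frames` is a list of (points, R, t) tuples (an assumed layout):
    world-space points on the object in one frame, plus the object's
    world pose (rotation R, translation t) in that frame.
    """
    merged: List[Vec3] = []
    for points, R, t in frames:
        merged.extend(world_to_object(p, R, t) for p in points)
    return merged


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degrees about z
    frames = [
        ([(6.0, 0.0, 0.0)], identity, (5.0, 0.0, 0.0)),
        ([(-1.0, 10.0, 0.0)], rot_z_90, (0.0, 10.0, 0.0)),
    ]
    print(densify_object(frames))
```

Because every observation is expressed in the object's own frame, points seen from different viewpoints accumulate into one denser cloud, which could then serve to initialize that object's 3D Gaussians.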

Paper Structure (18 sections, 9 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Scene decomposition into static background and dynamic objects, and scene reconstruction, using DENSER: (a) ground truth, (b) scene decomposition: static background, (c) scene decomposition: dynamic objects, (d) scene reconstruction.
  • Figure 2: DENSER Scene Composition Framework. The pipeline starts by processing raw sensor data to obtain a densified point cloud for each foreground object in its reference frame and for the static background. Object point clouds are used to initialize the 3D Gaussians of dynamic objects, for which wavelets are used to estimate their color appearance. The background point cloud initializes the 3D Gaussians of the static background, whose appearance is modelled using a traditional SH basis. All 3D Gaussians form a scene graph which can be jointly rendered for a novel view.
  • Figure 3: Qualitative image reconstruction comparison on the KITTI dataset (Geiger et al., 2012).
  • Figure 4: Ablation: Impact of the dimension of the wavelet basis on scene reconstruction performance.
  • Figure 5: Object Removal: The top row shows the ground truth (GT), while the bottom row displays the modified scenes where the bus has been removed.
  • ...and 3 more figures