GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion

Jiaxin Wei, Stefan Leutenegger

TL;DR

This work incorporates 3D Gaussians into a volumetric mapping system to exploit geometric information, and uses a quadtree data structure on images to drastically reduce the number of splats initialized. The result is a compact 3D Gaussian map with fewer artifacts, generated on the fly together with a volumetric map.

Abstract

Traditional volumetric fusion algorithms preserve the spatial structure of 3D scenes, which is beneficial for many tasks in computer vision and robotics. However, their visualizations often lack realism. Emerging 3D Gaussian splatting bridges this gap, but existing Gaussian-based reconstruction methods often suffer from artifacts and inconsistencies with the underlying 3D structure. They also struggle with real-time optimization, leaving them unable to provide users with immediate, high-quality feedback. One of the bottlenecks is the large number of Gaussian parameters that must be updated during optimization. Instead of using 3D Gaussians as a standalone map representation, we incorporate them into a volumetric mapping system to take advantage of geometric information, and we propose using a quadtree data structure on images to drastically reduce the number of splats initialized. In this way, we simultaneously generate a compact 3D Gaussian map with fewer artifacts and a volumetric map on the fly. Our method, GSFusion, significantly enhances computational efficiency without sacrificing rendering quality, as demonstrated on both synthetic and real datasets. Code will be available at https://github.com/goldoak/GSFusion.
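
To make the quadtree idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a grayscale image stored as nested lists with intensities in [0, 1], and it assumes that "contrast" means the difference between the maximum and minimum intensity in a region. The names `contrast`, `quadtree_leaves`, `leaf_centers`, and `min_size` are hypothetical. A region is split into four quadrants while its contrast exceeds the threshold, so uniform areas end up as a few large leaves and detailed areas as many small ones.

```python
from typing import List, Tuple

Image = List[List[float]]         # grayscale intensities in [0, 1], row-major
Leaf = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def contrast(img: Image, x: int, y: int, w: int, h: int) -> float:
    """Max minus min intensity in the region (an assumed contrast measure)."""
    values = [img[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    return max(values) - min(values)


def quadtree_leaves(img: Image, threshold: float, min_size: int = 1) -> List[Leaf]:
    """Split regions whose contrast exceeds `threshold`; return the leaf regions."""
    leaves: List[Leaf] = []
    stack = [(0, 0, len(img[0]), len(img))]  # start from the whole image
    while stack:
        x, y, w, h = stack.pop()
        too_small = w <= min_size or h <= min_size
        if too_small or contrast(img, x, y, w, h) <= threshold:
            leaves.append((x, y, w, h))
            continue
        hw, hh = w // 2, h // 2
        # Four quadrants; the right/bottom ones absorb any odd remainder.
        stack.extend([
            (x, y, hw, hh),
            (x + hw, y, w - hw, hh),
            (x, y + hh, hw, h - hh),
            (x + hw, y + hh, w - hw, h - hh),
        ])
    return leaves


def leaf_centers(leaves: List[Leaf]) -> List[Tuple[float, float]]:
    """Pixel-space center of each leaf: one candidate splat per leaf."""
    return [(x + w / 2.0, y + h / 2.0) for x, y, w, h in leaves]
```

In this sketch, an 8x8 image that is uniform yields a single leaf, whereas one candidate per pixel would yield 64. That gap is the kind of reduction in initialized splats the abstract describes.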

Paper Structure

This paper contains 27 sections, 7 equations, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: Comparison of methods on a real scene (8b5caf3398) from the ScanNet++ dataset. All reported results were obtained on a single NVIDIA RTX 3060 GPU.
  • Figure 2: System overview of our proposed GSFusion. At each time step, it takes a pair of RGB-D images as input. The depth data is fused into an octree-based TSDF grid to capture geometric structure, while the RGB image is segmented using a quadtree based on contrast. A new 3D Gaussian is then initialized at the back-projected center of a quadrant if a check of its nearest voxel finds no adjacent Gaussians. We optimize Gaussian parameters on the fly by minimizing the photometric loss between the rendered image and the input RGB image. Additionally, we maintain a keyframe set to deal with the forgetting problem. After scanning, the system provides both a volumetric map and a 3D Gaussian map for subsequent tasks.
  • Figure 3: Effect of different quadtree thresholds. Top: input RGB image (left) and image rendered from the map created using a quadtree threshold of 0.1 (right). Bottom: RGB images segmented with different quadtree thresholds. Using stricter quadtree thresholds can help capture finer details, particularly thin edges caused by contrasts.
  • Figure 4: Qualitative rendering results from training and novel views on the ScanNet++ dataset. Zoom in for a clearer view.
  • Figure 5: Workflow of our GSFusion applied to self-collected real-world drone data.
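
The initialization rule in the Figure 2 caption (back-project a quadrant center, then add a Gaussian only if its nearest voxel has none) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the GSFusion code. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, keeps points in the camera frame (an identity pose), and uses a plain dictionary keyed by integer voxel indices in place of the octree-based TSDF grid. The names `back_project`, `voxel_key`, and `try_add_gaussian` are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point:
    """Pinhole back-projection of pixel (u, v) with metric depth (camera frame)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def voxel_key(p: Point, voxel_size: float) -> VoxelKey:
    """Integer index of the voxel containing point p."""
    return tuple(math.floor(c / voxel_size) for c in p)


def try_add_gaussian(gaussians: Dict[VoxelKey, Point],
                     p: Point, voxel_size: float) -> bool:
    """Store p as a new Gaussian center unless its voxel already holds one.

    Assumption: "no adjacent Gaussians" is modeled as at most one Gaussian
    per voxel; the real system may use a different adjacency test.
    """
    key = voxel_key(p, voxel_size)
    if key in gaussians:
        return False
    gaussians[key] = p
    return True
```

A short usage example with made-up intrinsics and a 0.5 m voxel size:

```python
gaussians = {}
center = back_project(420.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0)
try_add_gaussian(gaussians, center, 0.5)            # accepted: voxel was empty
try_add_gaussian(gaussians, (0.45, 0.0, 2.1), 0.5)  # rejected: same voxel
```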