EGG-Fusion: Efficient 3D Reconstruction with Geometry-aware Gaussian Surfel on the Fly

Xiaokun Pan, Zhenzhe Li, Zhichao Ye, Hongjia Zhai, Guofeng Zhang

TL;DR

EGG-Fusion tackles real-time 3D reconstruction under sensor noise by introducing Gaussian surfels with an information-filter fusion mechanism and geometry-aware surfel initialization. The framework combines robust sparse-to-dense camera tracking with differentiable surfel optimization to maintain high-fidelity geometry while running at 24 FPS. It demonstrates state-of-the-art accuracy on the Replica and ScanNet++ benchmarks and delivers high-quality novel view renderings. The work advances practical real-time differentiable-rendering SLAM by improving stability, efficiency, and surface confidence through explicit surfel-level uncertainty handling and regularization.

Abstract

Real-time 3D reconstruction is a fundamental task in computer graphics. Recently, differentiable-rendering-based SLAM systems have demonstrated significant potential, enabling photorealistic scene rendering through learnable scene representations such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). Current differentiable rendering methods face dual challenges in real-time computation and sensor noise sensitivity, leading to degraded geometric fidelity in scene reconstruction and limited practicality. To address these challenges, we propose EGG-Fusion, a novel real-time system featuring robust sparse-to-dense camera tracking and a geometry-aware Gaussian surfel mapping module. The system introduces an information-filter-based fusion method that explicitly accounts for sensor noise to achieve high-precision surface reconstruction. The proposed differentiable Gaussian surfel mapping effectively models multi-view consistent surfaces while enabling efficient parameter optimization. Extensive experimental results demonstrate that the proposed system achieves a surface reconstruction error of 0.6 cm on standardized benchmark datasets including Replica and ScanNet++, representing an improvement in accuracy of over 20% compared to state-of-the-art (SOTA) GS-based methods. Notably, the system maintains real-time processing at 24 FPS, establishing it as one of the most accurate differentiable-rendering-based real-time reconstruction systems. Project Page: https://zju3dv.github.io/eggfusion/
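
To make the idea of information-filter-based fusion concrete, the following is a minimal sketch of the general technique, not the authors' implementation. It fuses scalar depth observations of a single surfel in information form (inverse variance), so that noisier measurements contribute less. The names `SurfelDepthEstimate`, `from_measurement`, and `fuse`, and the assumption of independent Gaussian noise with a known standard deviation per observation, are illustrative and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class SurfelDepthEstimate:
    """Scalar depth estimate stored in information form (hypothetical container)."""
    info: float        # information = 1 / variance
    info_mean: float   # information-weighted mean = info * mean

    @property
    def mean(self) -> float:
        return self.info_mean / self.info

    @property
    def variance(self) -> float:
        return 1.0 / self.info


def from_measurement(z: float, sigma: float) -> SurfelDepthEstimate:
    """Initialise an estimate from one observation z with noise std sigma (assumed known)."""
    w = 1.0 / (sigma * sigma)
    return SurfelDepthEstimate(info=w, info_mean=w * z)


def fuse(est: SurfelDepthEstimate, z: float, sigma: float) -> SurfelDepthEstimate:
    """Information-filter update: information and information-weighted means add."""
    w = 1.0 / (sigma * sigma)
    return SurfelDepthEstimate(info=est.info + w, info_mean=est.info_mean + w * z)


if __name__ == "__main__":
    est = from_measurement(2.0, 0.02)
    est = fuse(est, 2.1, 0.02)    # equally reliable observation: mean moves halfway
    print(est.mean, est.variance)
    est = fuse(est, 3.0, 0.5)     # very noisy observation: mean barely moves
    print(est.mean, est.variance)
```

In this form each update is a pair of additions, which is one reason information filters suit per-primitive fusion at frame rate. A full surfel would carry a position and a normal with matrix-valued information rather than a single scalar.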

Paper Structure

This paper contains 32 sections, 21 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Framework of EGG-Fusion. Our framework is divided into two integral components. In the scene mapping module (Sec. \ref{sec:scene_mapping}), Gaussian surfels are used as the fundamental primitives for scene representation and achieve high-quality real-time reconstruction. The camera tracking module (Sec. \ref{sec:pose_estim}) employs a sparse-to-dense strategy to ensure robust estimation of camera poses.
  • Figure 2: Gaussian surfels fusion. We ensure that surfels can be explicitly and continuously updated with new observations, enabling them to adhere to the scene surface (left), while new observations are utilized to update normal information (right), thereby achieving more accurate surface reconstruction.
  • Figure 3: TSDF-based Reconstruction Results on Replica and ScanNet++. On the reconstructed meshes for Replica \cite{straubReplicaDatasetDigital2019} and ScanNet++ \cite{scannet++}, our method outperforms other methods in both overall quality and detail accuracy.
  • Figure 4: Geometry accuracy of points. Our method achieves high-precision reconstruction globally. In contrast, other methods exhibit higher errors either in regions of complex local geometry (RTG-SLAM \cite{rtg_slam}) or in the overall geometric structure (SplaTAM \cite{splatam} and ElasticFusion \cite{whelanElasticFusionDenseSLAM2015c}).
  • Figure 5: Rendering Results on Replica and ScanNet++. We present a comparison of novel view synthesis results on Replica \cite{straubReplicaDatasetDigital2019} and ScanNet++ \cite{scannet++}. Our method demonstrates superior rendering details in both training views (Replica, left) and testing views (ScanNet++, right).
  • ...and 9 more figures