
GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction

Haodong Xiang, Xinghui Li, Kai Cheng, Xiansong Lai, Wanting Zhang, Zhichao Liao, Long Zeng, Xueping Liu

TL;DR

GaussianRoom addresses indoor scene reconstruction by integrating a neural SDF with 3D Gaussian Splatting to produce high-quality geometry and fast rendering. It introduces an SDF-guided primitive distribution to densify and prune Gaussians and a Gaussian-guided sampling approach to accelerate neural SDF training, complemented by monocular normal and edge priors to handle textureless regions and fine details. Extensive experiments on ScanNet and ScanNet++ show state-of-the-art performance in surface reconstruction and novel view synthesis while preserving rendering efficiency. The approach demonstrates a positive mutual learning loop between the SDF and Gaussian representations, enabling robust indoor scene reconstruction with efficient rendering.

Abstract

Embodied intelligence requires precise reconstruction and rendering to simulate large-scale real-world data. Although 3D Gaussian Splatting (3DGS) has recently demonstrated high-quality results with real-time performance, it still struggles in indoor scenes with large, textureless regions, where poor point cloud initialization and underconstrained optimization lead to incomplete and noisy reconstructions. Inspired by the continuity of signed distance fields (SDFs), which are naturally suited to modeling surfaces, we propose a unified optimization framework that integrates a neural SDF with 3DGS for accurate geometry reconstruction and real-time rendering. The framework uses the neural SDF to guide the densification and pruning of Gaussians, enabling them to model scenes accurately even from poorly initialized point clouds. Simultaneously, the geometry represented by the Gaussians improves the efficiency of the SDF by guiding its point sampling. Additionally, we introduce two regularization terms based on normal and edge priors to resolve geometric ambiguities in textureless areas and enhance detail accuracy. Extensive experiments on ScanNet and ScanNet++ show that our method achieves state-of-the-art performance in both surface reconstruction and novel view synthesis.
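
To make the idea of SDF-guided densification and pruning concrete, here is a minimal sketch. The abstract does not give the actual criterion, so everything below is an assumption made for illustration: the function name `classify_gaussians`, the thresholds `near` and `far`, and the analytic `sphere_sdf` that stands in for a trained neural SDF. The sketch keeps Gaussians whose centers lie close to the zero level set, drops those far from it, and flags the closest ones as densification candidates.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]


def sphere_sdf(p: Point, radius: float = 1.0) -> float:
    """Hypothetical stand-in for a neural SDF: distance to a sphere at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def classify_gaussians(
    centers: Sequence[Point],
    sdf: Callable[[Point], float],
    near: float = 0.05,  # assumed threshold: "on the surface"
    far: float = 0.5,    # assumed threshold: "too far from any surface"
) -> Tuple[List[int], List[int]]:
    """Return (indices to keep, indices to densify) based on |SDF(center)|."""
    keep: List[int] = []
    densify: List[int] = []
    for i, center in enumerate(centers):
        dist = abs(sdf(center))
        if dist > far:
            continue  # prune: the center is far from the estimated surface
        keep.append(i)
        if dist < near:
            densify.append(i)  # candidate for splitting or cloning near the surface
    return keep, densify


centers = [(1.0, 0.0, 0.0), (1.2, 0.0, 0.0), (3.0, 0.0, 0.0)]
keep, densify = classify_gaussians(centers, sphere_sdf)
print(keep, densify)
```

A real system would likely combine the SDF value with other signals such as opacity or gradient magnitude; the hard thresholds here only illustrate the direction of the guidance.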

Paper Structure (18 sections, 13 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Overview. GaussianRoom integrates a neural SDF with 3DGS, forming a positive cycle in which each representation improves the other. (a) We employ the geometric information from the SDF to constrain the Gaussian primitives, ensuring their spatial distribution aligns with the scene surface. (b) We utilize rasterized depth from the Gaussians to efficiently provide coarse geometry information, narrowing down the sampling range to accelerate the optimization of the neural SDF. (c) We introduce monocular normal and edge priors to address the challenges of texture-less areas and fine structures in indoor scenes.
  • Figure 2: (a) Gaussian primitives distribution (b) Ground truth scene surface and Gaussian primitives distribution
  • Figure 3: The red Gaussian points represent new Gaussians generated by the SDF-guided Global Densification strategy, while the green Gaussian points indicate those adjusted through the SDF-guided Densification and Pruning process.
  • Figure 4: Qualitative reconstruction comparisons. For each indoor scene, the first row shows the top view of the whole room, and the second row shows details of the masked region. The reconstructions from GaussianRoom are visibly more complete than those of other methods, especially in fine details.
  • Figure 5: Qualitative rendering comparisons. As the highlighted patches show, the renderings from GaussianRoom are better than those of other GS-based methods, including in texture-less regions and fine details.
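
The sampling idea in Figure 1(b) can be sketched as follows, under stated assumptions. The caption says only that rasterized depth narrows the sampling range; the function name `depth_guided_samples`, the symmetric window `half_width`, the even spacing, and the `near`/`far` ray bounds are hypothetical choices for illustration, not the paper's implementation.

```python
from typing import List


def depth_guided_samples(
    depth: float,       # coarse depth along the ray, e.g. rasterized from the Gaussians
    half_width: float,  # assumed half-size of the sampling window around that depth
    n: int,             # number of sample distances to generate
    near: float = 0.0,  # assumed ray bounds used to clamp the window
    far: float = 10.0,
) -> List[float]:
    """Return n evenly spaced ray distances inside a window centered on depth."""
    if n < 1:
        raise ValueError("n must be at least 1")
    lo = max(near, depth - half_width)
    hi = min(far, depth + half_width)
    if n == 1:
        return [0.5 * (lo + hi)]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


# Five samples concentrated around a coarse depth of 2.0 instead of the full [0, 10] range.
print(depth_guided_samples(depth=2.0, half_width=0.5, n=5))
```

Concentrating samples near the coarse surface means fewer SDF network queries per ray for the same resolution around the surface, which is the efficiency gain the caption describes.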