GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction
Haodong Xiang, Xinghui Li, Kai Cheng, Xiansong Lai, Wanting Zhang, Zhichao Liao, Long Zeng, Xueping Liu
TL;DR
GaussianRoom addresses indoor scene reconstruction by integrating a neural SDF with 3D Gaussian Splatting to produce high-quality geometry with fast rendering. It introduces an SDF-guided primitive distribution strategy that densifies and prunes Gaussians, and a Gaussian-guided sampling scheme that accelerates neural SDF training. Monocular normal and edge priors complement both by handling textureless regions and fine details. Extensive experiments on ScanNet and ScanNet++ show state-of-the-art performance in surface reconstruction and novel view synthesis while preserving rendering efficiency. The SDF and Gaussian representations reinforce each other in a mutual learning loop, enabling robust indoor scene reconstruction with efficient rendering.
Abstract
Embodied intelligence requires precise reconstruction and rendering to simulate large-scale real-world data. Although 3D Gaussian Splatting (3DGS) has recently demonstrated high-quality results with real-time performance, it still struggles in indoor scenes with large, textureless regions, where poor point cloud initialization and underconstrained optimization lead to incomplete and noisy reconstructions. Inspired by the continuity of signed distance fields (SDFs), which are naturally suited to modeling surfaces, we propose a unified optimization framework that integrates a neural SDF with 3DGS for accurate geometry reconstruction and real-time rendering. The framework uses the neural SDF to guide the densification and pruning of Gaussians, enabling them to model scenes accurately even from poorly initialized point clouds. Simultaneously, the geometry represented by the Gaussians improves the efficiency of the SDF by guiding its point sampling. Additionally, we introduce two regularization terms based on normal and edge priors to resolve geometric ambiguities in textureless areas and enhance detail accuracy. Extensive experiments on ScanNet and ScanNet++ show that our method achieves state-of-the-art performance in both surface reconstruction and novel view synthesis.
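The two guidance directions described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: an analytic plane SDF stands in for the neural SDF, and the function names (`plane_sdf`, `sdf_guided_update`, `gaussian_guided_samples`), the distance thresholds, the jitter scale, and the nearest-Gaussian depth heuristic are all assumptions made for illustration only.

```python
import math
import random


def plane_sdf(p):
    """Toy SDF (assumption): signed distance to the floor plane z = 0.
    A stand-in for the neural SDF, which is not reproduced here."""
    return p[2]


def sdf_guided_update(centers, sdf, prune_dist=0.5, densify_dist=0.05,
                      jitter=0.01, seed=0):
    """Prune Gaussian centers far from the SDF zero level set and clone
    those close to it. All thresholds are hypothetical values."""
    rng = random.Random(seed)
    kept = []
    for c in centers:
        d = abs(sdf(c))
        if d > prune_dist:
            continue  # far from any surface: treat as a floater and drop it
        kept.append(c)
        if d < densify_dist:
            # near the surface: add a jittered copy to raise local density
            kept.append(tuple(x + rng.gauss(0.0, jitter) for x in c))
    return kept


def gaussian_guided_samples(origin, direction, centers, n_samples=8,
                            half_width=0.1):
    """Return ray depths concentrated in a narrow band around the depth of
    the Gaussian center closest to the ray. `direction` must be unit length.
    The nearest-center heuristic is an assumption of this sketch."""
    best_t, best_perp = None, float("inf")
    for c in centers:
        v = [ci - oi for ci, oi in zip(c, origin)]
        t = sum(vi * di for vi, di in zip(v, direction))  # depth along ray
        if t <= 0:
            continue  # center lies behind the ray origin
        # perpendicular distance from the center to the ray
        perp = math.sqrt(max(sum(vi * vi for vi in v) - t * t, 0.0))
        if perp < best_perp:
            best_t, best_perp = t, perp
    if best_t is None:
        return []  # no Gaussian in front of the ray: no guidance available
    if n_samples < 2:
        return [best_t]
    lo = max(best_t - half_width, 0.0)
    hi = best_t + half_width
    step = (hi - lo) / (n_samples - 1)
    return [lo + i * step for i in range(n_samples)]
```

Calling `sdf_guided_update(centers, plane_sdf)` on a list of 3-tuples returns the pruned and densified list, and `gaussian_guided_samples(origin, direction, centers)` returns depths at which an SDF network would be queried. Restricting samples to a band around the Gaussian-implied surface is one plausible way to reduce the number of SDF evaluations per ray compared with sampling the whole ray.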
