Sparis: Neural Implicit Surface Reconstruction of Indoor Scenes from Sparse Views
Yulun Wu, Han Huang, Wenyuan Zhang, Chao Deng, Ge Gao, Ming Gu, Yu-Shen Liu
TL;DR
Indoor scenes captured from only a few views are hard to reconstruct, largely because the monocular priors that existing methods rely on become unreliable. Sparis addresses this by combining a VolSDF-based neural implicit surface with inter-image depth priors derived from 2D feature matching, together with cross-view reprojection and two match-refinement strategies (an angular filter and an epipolar weighting function) that enforce geometric consistency. Results on ScanNet and Replica show substantial improvements in F-score and depth accuracy under sparse views, with more complete and smoother surfaces than prior methods. The approach reduces reliance on dense inputs and is more robust to matching noise, which makes it practical for real-world indoor reconstruction.
Abstract
In recent years, reconstructing indoor scene geometry from multi-view images has made encouraging progress. Current methods incorporate monocular priors into neural implicit surface models to achieve high-quality reconstructions. However, these methods require hundreds of images per scene. When only a limited number of views are available, the performance of monocular priors deteriorates due to scale ambiguity, and the reconstructed scene geometry collapses. In this paper, we propose a new method, named Sparis, for indoor surface reconstruction from sparse views. Specifically, we investigate the impact of monocular priors on sparse scene reconstruction and introduce a novel prior based on inter-image matching information. Our prior offers more accurate depth information while ensuring cross-view matching consistency. Additionally, we employ an angular filter strategy and an epipolar matching weight function to reduce errors caused by inaccurate view matching, thereby refining the inter-image prior for improved reconstruction accuracy. Experiments on widely used benchmarks demonstrate superior performance in sparse-view scene reconstruction.
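To make the idea of an epipolar matching weight concrete, the following is a minimal sketch, not the formulation used in Sparis. It assumes a known fundamental matrix F between two views and scores a 2D match by its pixel distance to the epipolar line. The function names, the exponential decay, and the `sigma` parameter are all hypothetical choices made for illustration.

```python
import math

def epipolar_line(F, x):
    """Return the epipolar line l = F @ [u, v, 1] in image 2 as (a, b, c)."""
    u, v = x
    return tuple(row[0] * u + row[1] * v + row[2] for row in F)

def epipolar_distance(F, x1, x2):
    """Pixel distance from x2 to the epipolar line induced by x1."""
    a, b, c = epipolar_line(F, x1)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line: treat the match as unreliable
    return abs(a * x2[0] + b * x2[1] + c) / norm

def epipolar_weight(F, x1, x2, sigma=2.0):
    """Hypothetical weight in [0, 1]: 1 on the epipolar line, decaying with distance.

    The exponential form and sigma are assumptions, not taken from the paper.
    """
    return math.exp(-epipolar_distance(F, x1, x2) / sigma)

# Toy setup (assumed): identity intrinsics and a pure horizontal translation,
# so F = [t]_x with t = (1, 0, 0) and epipolar lines are horizontal (v2 = v1).
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

good_match = epipolar_weight(F, (10.0, 20.0), (35.0, 20.0))  # lies on the line
noisy_match = epipolar_weight(F, (10.0, 20.0), (35.0, 23.0))  # 3 pixels off the line
```

In this toy setup the consistent match receives full weight and the noisy one is down-weighted, which is the general behavior such a weight is meant to provide when match-derived depth supervises the surface.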
