Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping
Tongyan Hua, Haotian Bai, Zidong Cao, Ming Liu, Dacheng Tao, Lin Wang
TL;DR
Hi-Map addresses monocular dense mapping without depth priors by introducing a hierarchical factorized grid representation and a dual-path encoding strategy. It decouples geometry and appearance, applies SDF-based proxy rendering for stable density estimation, and performs online optimization within a sliding window to achieve real-time performance. The method demonstrates superior geometric and textural fidelity on the Replica dataset and shows robustness in textureless regions, outperforming state-of-the-art monocular NeRF-based methods. This approach reduces memory and computation while maintaining high-quality reconstructions, enabling practical dense mapping in depth-scarce scenarios.
Abstract
In this paper, we introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Field (NeRF). Hi-Map achieves efficient and high-fidelity mapping using only posed RGB inputs, eliminating the need for external depth priors such as those produced by a depth estimation model. Our key idea is to represent the scene as a hierarchical feature grid that encodes the radiance, and then to factorize this grid into feature planes and vectors. The resulting representation is simpler and generalizes better, so it converges quickly and smoothly on new observations. Reducing the complexity of the scene representation in this way keeps computation efficient and alleviates noise patterns. Building on the hierarchical factorized representation, we use the Signed Distance Field (SDF) as a rendering proxy for inferring the volume density, which yields high mapping fidelity. Moreover, we introduce a dual-path encoding strategy to strengthen the photometric cues and further boost the mapping quality, especially in distant and textureless regions. Extensive experiments demonstrate our method's superiority in geometric and textural accuracy over state-of-the-art NeRF-based monocular mapping methods.
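To make the two central ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a TensoRF-style plane-vector factorization (three feature planes, each paired with the vector along the remaining axis), two illustrative resolution levels, and the Laplace-CDF SDF-to-density mapping used in VolSDF; the abstract does not state which conversion Hi-Map uses. All function names, resolutions, channel counts, and the `beta` value are hypothetical.

```python
import numpy as np

def lerp_vector(vec, t):
    """Linearly interpolate an (R, C) feature vector at t in [0, 1]."""
    r = vec.shape[0]
    x = float(np.clip(t, 0.0, 1.0)) * (r - 1)
    i0 = int(np.floor(x))
    i1 = min(i0 + 1, r - 1)
    w = x - i0
    return (1 - w) * vec[i0] + w * vec[i1]

def bilerp_plane(plane, u, v):
    """Bilinearly interpolate an (R, R, C) feature plane at (u, v) in [0, 1]^2."""
    r = plane.shape[0]
    x = float(np.clip(u, 0.0, 1.0)) * (r - 1)
    y = float(np.clip(v, 0.0, 1.0)) * (r - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, r - 1), min(y0 + 1, r - 1)
    wx, wy = x - x0, y - y0
    low = (1 - wx) * plane[x0, y0] + wx * plane[x1, y0]
    high = (1 - wx) * plane[x0, y1] + wx * plane[x1, y1]
    return (1 - wy) * low + wy * high

def make_level(res, channels, rng):
    """One resolution level: three planes (XY, XZ, YZ) and three vectors (Z, Y, X)."""
    planes = [rng.normal(scale=0.1, size=(res, res, channels)) for _ in range(3)]
    vectors = [rng.normal(scale=0.1, size=(res, channels)) for _ in range(3)]
    return planes, vectors

def query_level(level, p):
    """Feature at point p = (x, y, z) in the unit cube: sum of plane * vector products."""
    planes, vectors = level
    x, y, z = p
    feat = bilerp_plane(planes[0], x, y) * lerp_vector(vectors[0], z)
    feat = feat + bilerp_plane(planes[1], x, z) * lerp_vector(vectors[1], y)
    feat = feat + bilerp_plane(planes[2], y, z) * lerp_vector(vectors[2], x)
    return feat

def query_hierarchy(levels, p):
    """Concatenate per-level features, coarse to fine."""
    return np.concatenate([query_level(level, p) for level in levels])

def dense_params(res, channels):
    """Parameter count of a dense res^3 feature grid."""
    return res ** 3 * channels

def factorized_params(res, channels):
    """Parameter count of three planes plus three vectors at the same resolution."""
    return 3 * (res ** 2 + res) * channels

def sdf_to_density(sdf, beta=0.1):
    """Assumed VolSDF-style mapping: density = (1 / beta) * LaplaceCDF(-sdf)."""
    sdf = np.asarray(sdf, dtype=float)
    e = 0.5 * np.exp(-np.abs(sdf) / beta)
    return np.where(sdf >= 0, e, 1.0 - e) / beta

rng = np.random.default_rng(0)
levels = [make_level(res, 4, rng) for res in (8, 32)]  # coarse and fine level
feature = query_hierarchy(levels, (0.2, 0.5, 0.9))     # 4 channels per level
density = sdf_to_density([-0.05, 0.0, 0.05])           # inside, on, outside the surface
```

In this sketch the factorized storage grows quadratically with resolution rather than cubically, which is the kind of memory saving the abstract attributes to the factorization. A decoder network that maps the concatenated feature to an SDF value and a colour is omitted here.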
