NVSim: Novel View Synthesis Simulator for Large Scale Indoor Navigation
Mingyu Jeong, Eunsung Kim, Sehun Park, Andrew Jaeyong Choi
TL;DR
NVSim addresses the scalability and realism gap in indoor vision-and-language navigation (VLN) simulators by automatically building navigable environments from traversal image sequences. It uses a two-stage approach: (i) a scalable 3D scene representation built from submaps, using Floor-Aware Gaussian Splatting to suppress floor artifacts, and (ii) a mesh-free traversability check that constructs a topological graph G=(V,E) directly from rendered views. Its contributions include a Hybrid Floor Segmentation method, a spherical-harmonics background for floor regions, and BFS-based topomap generation that yields valid navigable graphs without meshes. Evaluations on the COEX dataset show effective floor artifact removal, robust topological maps, and feasible R2R-style navigation, illustrating NVSim's potential to enable scalable VLN research and broader embodied AI applications.
Abstract
We present NVSim, a framework that automatically constructs large-scale, navigable indoor simulators from common image sequences alone, overcoming the cost and scalability limitations of traditional 3D scanning. Our approach adapts 3D Gaussian Splatting to address visual artifacts on sparsely observed floors, a common issue in robotic traversal data. We introduce Floor-Aware Gaussian Splatting to ensure a clean, navigable ground plane, and a novel mesh-free traversability checking algorithm that constructs a topological graph by directly analyzing rendered views. We demonstrate our system's ability to generate valid, large-scale navigation graphs from real-world data. A video demonstration is available at https://youtu.be/tTiIQt6nXC8
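To make the idea of BFS-based graph construction concrete, the following is a minimal sketch rather than the authors' implementation. It assumes candidate viewpoints lie on a 2D integer grid and that traversability between two neighbouring viewpoints is decided by a caller-supplied predicate. The names build_topomap and is_traversable, the grid layout, and the max_nodes cap are all hypothetical. In NVSim the check is made by analyzing rendered views, and here a simple set lookup stands in for it.

```python
from collections import deque
from typing import Callable, FrozenSet, Set, Tuple

Node = Tuple[int, int]  # hypothetical: a viewpoint indexed on a 2D grid


def build_topomap(
    start: Node,
    is_traversable: Callable[[Node, Node], bool],
    max_nodes: int = 10_000,
) -> Tuple[Set[Node], Set[FrozenSet[Node]]]:
    """Expand a graph G=(V,E) outward from `start` with breadth-first search.

    `is_traversable(a, b)` is an assumed stand-in for a rendered-view
    check; it should return True when an agent can move from a to b.
    """
    vertices: Set[Node] = {start}
    edges: Set[FrozenSet[Node]] = set()
    queue = deque([start])

    while queue:
        node = queue.popleft()
        x, y = node
        # Try the four axis-aligned neighbours of the current viewpoint.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if not is_traversable(node, nxt):
                continue
            if nxt not in vertices:
                if len(vertices) >= max_nodes:
                    continue  # node budget exhausted; skip new vertices
                vertices.add(nxt)
                queue.append(nxt)
            # Undirected edge; frozenset removes (a, b) / (b, a) duplicates.
            edges.add(frozenset((node, nxt)))

    return vertices, edges


# Toy stand-in for the rendered-view check: an L-shaped corridor of free cells.
free_cells = {(0, 0), (1, 0), (2, 0), (2, 1)}
vertices, edges = build_topomap((0, 0), lambda a, b: b in free_cells)
```

In this toy corridor the search reaches every free cell from the start position. Replacing the lambda with a check on rendered views would leave the BFS loop unchanged, which is the appeal of keeping traversability behind a single predicate.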
