Mapping at First Sense: A Lightweight Neural Network-Based Indoor Structures Prediction Method for Robot Autonomous Exploration
Haojia Gao, Haohua Que, Kunrong Li, Weihao Shan, Mingkai Liu, Rong Zhao, Lei Mu, Xinghua Yang, Qi Wei, Fei Qiao
TL;DR
The paper tackles efficient autonomous exploration in unknown indoor environments by predicting unobserved map regions to guide planning. It introduces SenseMapNet, a lightweight dual-branch architecture that fuses convolutional encoding with a Transformer encoder to predict locally occluded regions from the local observation map. SenseMapDataset, built from KTH and HouseExpo, supports training and evaluation, including comparisons against frontier-based exploration. SenseMapNet reaches an SSIM of 0.78, LPIPS of 0.68, and FID of 239.79 for map reconstruction, cuts exploration time by 46.5% (from 2335.56 s to 1248.68 s), and attains 88% coverage and 88% reconstruction accuracy, demonstrating practical benefits for indoor robotic exploration.
Abstract
Autonomous exploration in unknown environments is a critical challenge in robotics, particularly for applications such as indoor navigation, search and rescue, and service robotics. Traditional exploration strategies, such as frontier-based methods, often struggle to efficiently utilize prior knowledge of structural regularities in indoor spaces. To address this limitation, we propose Mapping at First Sense, a lightweight neural network-based approach that predicts unobserved areas in local maps, thereby enhancing exploration efficiency. The core of our method, SenseMapNet, integrates convolutional and transformer-based architectures to infer occluded regions while maintaining computational efficiency for real-time deployment on resource-constrained robots. Additionally, we introduce SenseMapDataset, a curated dataset constructed from KTH and HouseExpo environments, which facilitates training and evaluation of neural models for indoor exploration. Experimental results demonstrate that SenseMapNet achieves an SSIM (structural similarity) of 0.78, LPIPS (perceptual quality) of 0.68, and an FID (feature distribution alignment) of 239.79, outperforming conventional methods in map reconstruction quality. Compared to traditional frontier-based exploration, our method reduces exploration time by 46.5% (from 2335.56 s to 1248.68 s) while maintaining a high coverage rate (88%) and achieving a reconstruction accuracy of 88%. The proposed method represents a promising step toward efficient, learning-driven robotic exploration in structured environments.
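The reported 46.5% reduction follows directly from the two exploration times quoted in the abstract. The short sketch below reproduces that arithmetic; the function name `relative_reduction` and the variable names are illustrative choices, not identifiers from the paper.

```python
def relative_reduction(baseline_s: float, improved_s: float) -> float:
    """Return the fractional time saved relative to the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_s - improved_s) / baseline_s


# Exploration times quoted in the abstract, in seconds.
frontier_time_s = 2335.56   # traditional frontier-based exploration
sensemap_time_s = 1248.68   # exploration guided by SenseMapNet predictions

reduction = relative_reduction(frontier_time_s, sensemap_time_s)
print(f"Exploration time reduction: {reduction:.1%}")
```

The reduction is measured against the frontier-based baseline, so the denominator is the baseline time of 2335.56 s rather than the improved time.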
