MA-SLAM: Active SLAM in Large-Scale Unknown Environment using Map Aware Deep Reinforcement Learning
Yizhen Yin, Yuhua Qi, Dapeng Feng, Hongbo Chen, Hongjun Ma, Jin Wu, Yi Jiang
TL;DR
MA-SLAM tackles active SLAM in large-scale unknown environments by introducing a map-aware DRL framework guided by a structured map representation. It decouples high-level decision making from low-level control, leveraging a long-horizon global planner and an Action Optimization Unit to produce actionable waypoints. The key contributions are a discretized ROI-based map encoding, dynamic boundary point processing, and a PPO-based policy augmented with map-aware rewards, validated in Gazebo and on a real UGV, showing reduced exploration time and path length relative to frontier-based, RRT-based, TARE, and DA-SLAM baselines. The results suggest practical viability for efficient large-scale exploration.
Abstract
Active Simultaneous Localization and Mapping (Active SLAM) plans and controls a robot's motion so that it builds an accurate and complete representation of its surroundings, and it has attracted significant attention in the research community. While current methods are effective in small, controlled settings, they struggle in large-scale and diverse environments, where exploration takes longer and the resulting paths are suboptimal. In this paper, we propose MA-SLAM, a Map-Aware Active SLAM system based on Deep Reinforcement Learning (DRL), designed for efficient exploration in large-scale environments. To this end, we introduce a novel structured map representation. By discretizing the spatial data and integrating the boundary points and the historical trajectory, the structured map compactly encodes the visited regions and serves as input to the DRL-based decision module. Instead of having the decision module predict the next action step by step, we implement a global planner that optimizes the exploration path by leveraging long-range target points. We conducted experiments in three simulation environments and deployed the system on a real unmanned ground vehicle (UGV). The results demonstrate that our approach significantly reduces both exploration time and travel distance compared with state-of-the-art methods.
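To make the idea of a structured map concrete, the following is a minimal sketch of one way to discretize an occupancy grid and fold in boundary points and the visited trajectory. It is an illustration under stated assumptions, not the encoding used in MA-SLAM: the cell labels, the 4-neighbour boundary test, the block size, and the names `is_boundary` and `encode_structured_map` are all hypothetical.

```python
# Hypothetical cell labels; the paper does not specify these values.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def is_boundary(grid, r, c):
    """Assumed rule: a free cell is a boundary point if a 4-neighbour is unknown."""
    if grid[r][c] != FREE:
        return False
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        in_grid = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
        if in_grid and grid[nr][nc] == UNKNOWN:
            return True
    return False


def encode_structured_map(grid, trajectory, block=2):
    """Coarsen the grid into block x block regions and return two layers:
    boundary[i][j] = number of boundary cells inside region (i, j),
    visited[i][j]  = 1 if the trajectory passed through region (i, j)."""
    rows, cols = len(grid) // block, len(grid[0]) // block
    boundary = [[0] * cols for _ in range(rows)]
    visited = [[0] * cols for _ in range(rows)]
    for r in range(rows * block):
        for c in range(cols * block):
            if is_boundary(grid, r, c):
                boundary[r // block][c // block] += 1
    for r, c in trajectory:
        # Ignore poses that fall outside the coarsened area.
        if 0 <= r < rows * block and 0 <= c < cols * block:
            visited[r // block][c // block] = 1
    return boundary, visited


# Toy 4x4 map: the left side is explored, the right side is unknown.
grid = [
    [0, 0, -1, -1],
    [0, 0, -1, -1],
    [0, 1, -1, -1],
    [0, 0,  0, -1],
]
trajectory = [(0, 0), (1, 0), (2, 0), (3, 0)]  # robot drove down the left column
boundary, visited = encode_structured_map(grid, trajectory)
print(boundary)  # [[2, 0], [0, 1]]
print(visited)   # [[1, 0], [1, 0]]
```

The two coarse layers are far smaller than the raw grid, which is the property the abstract attributes to the structured map. A learned policy would plausibly consume them as stacked input channels, though the actual input layout is not stated here.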
