Towards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots
Lei Tai, Ming Liu
TL;DR
The paper tackles autonomous exploration for mobile robots in unknown indoor environments by using raw depth data as input to an end-to-end deep reinforcement learning framework. It initializes the DRL network from a CNN trained on real-world data and trains online in Gazebo-based simulations with collision-based rewards, enabling adaptation to unseen scenes without labeling. Results show that the DRL approach outperforms CNN-based supervised and traditional RL baselines and can transfer from simulation to real-world settings, with receptive-field visualization providing interpretability of traversability decisions. The study suggests promising directions, including incorporating RGB inputs and modern CNN backbones to further enhance perception and control in broader environments.
Abstract
Exploration in an unknown environment is a core functionality for mobile robots. Learning-based exploration methods, including convolutional neural networks, provide excellent strategies without human-designed logic for feature extraction. However, conventional supervised learning algorithms inevitably require substantial effort to label datasets, and scenes not included in the training set mostly go unrecognized. We propose a deep reinforcement learning method for the exploration of mobile robots in an indoor environment using only the depth information from an RGB-D sensor. Based on the Deep Q-Network framework, the raw depth image is taken as the only input to estimate the Q values corresponding to all moving commands. The network weights are trained end-to-end. In arbitrarily constructed simulation environments, we show that the robot can quickly adapt to unfamiliar scenes without any manual labeling. In addition, analysis of the receptive fields of the feature representations shows that deep reinforcement learning motivates the convolutional networks to estimate the traversability of the scenes. The test results are compared with exploration strategies based separately on deep learning or reinforcement learning. Even though the network is trained only in the simulated environment, experimental results in a real-world environment demonstrate that the cognitive ability of the robot controller is dramatically improved compared with the supervised method. We believe this is the first time that raw sensor information has been used to build a cognitive exploration strategy for mobile robots through end-to-end deep reinforcement learning.
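
As a rough illustration of the Deep Q-Network idea described above, the following minimal sketch shows how a moving command might be chosen from estimated Q values and how a one-step Q-learning target could be formed from a collision-based reward. It is a sketch under stated assumptions, not the authors' implementation: the reward magnitudes, the discount factor, the epsilon-greedy selection rule, and the function names `reward`, `select_action`, and `td_target` are all hypothetical, and the convolutional network that maps a depth image to Q values is replaced by plain lists of numbers.

```python
import random

# Hypothetical constants for illustration only; none are taken from the paper.
GAMMA = 0.9               # discount factor
COLLISION_REWARD = -10.0  # penalty when the robot collides
STEP_REWARD = 1.0         # reward for a collision-free step


def reward(collided):
    """Collision-based reward: penalize collisions, reward free motion."""
    return COLLISION_REWARD if collided else STEP_REWARD


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over one Q value per moving command."""
    if rng.random() < epsilon:
        # Explore: pick a random moving command.
        return rng.randrange(len(q_values))
    # Exploit: pick the command with the highest estimated Q value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(r, next_q_values, terminal, gamma=GAMMA):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        # No future return after a terminal transition (here, a collision).
        return r
    return r + gamma * max(next_q_values)
```

In a full training loop, the network's output for the chosen command would be regressed toward `td_target`, so that labels come from the robot's own interaction with the simulator rather than from human annotation.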
