Asymmetric Information Enhanced Mapping Framework for Multirobot Exploration based on Deep Reinforcement Learning
Jiyu Cheng, Junhui Fan, Xiaolei Li, Paul L. Rosin, Yibin Li, Wei Zhang
TL;DR
This work addresses efficient multirobot exploration in unknown environments by introducing AIM-Mapping, a framework that exploits privileged information during training via an asymmetric actor–critic design and mutual information guidance. The method combines an Asymmetric Feature Representation, a Mutual Information Evaluation module, and a Graph-based Multi-Agent Decision-making Network to construct a topological environment representation and assign long-term goals through graph matching. Empirical results in iGibson simulations and real-world trials show AIM-Mapping achieves higher exploration efficiency and map completeness than several baselines, with strong generalization to different robot counts and scenes. The approach offers a scalable, information-rich strategy for cooperative exploration with practical relevance to autonomous inspection and search tasks.
Abstract
Despite substantial progress in multirobot technologies, exploring an unknown environment efficiently and collaboratively remains a major challenge. In this paper, we propose AIM-Mapping, an Asymmetric InforMation Enhanced Mapping framework. The framework uses privileged information during training in two ways: to help construct the environment representation, and to provide the supervisory signal within an asymmetric actor-critic training framework. Specifically, privileged information is used to evaluate exploration performance through an asymmetric feature representation module and a mutual information evaluation module. The decision-making network uses the trained feature encoder to extract structural information from the environment and combines it with a topological map constructed from geometric distance. Using this topological map representation, we employ topological graph matching to assign boundary points to each robot as long-term goal points. We conduct experiments in realistic scenarios using the Gibson simulation environments. The results show that the proposed method achieves a substantial performance improvement over existing methods.
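To make the goal-assignment step concrete, the following is a minimal sketch of assigning boundary (frontier) points to robots as long-term goals. It is not the learned topological graph matching described in the abstract: as a simplified stand-in, it assumes plain Euclidean distance as the matching cost and searches all one-to-one assignments exhaustively, which is only practical for a handful of robots. The function name `assign_goals` and the sample coordinates are hypothetical and chosen for illustration only.

```python
from itertools import permutations
from math import hypot


def assign_goals(robots, frontiers):
    """Assign one distinct frontier point to each robot.

    Illustrative stand-in for graph matching: minimizes the summed
    Euclidean distance by brute force over all one-to-one assignments.
    robots and frontiers are lists of (x, y) tuples.
    Returns (assignment, total_cost), where assignment maps
    robot index -> frontier index.
    """
    if len(frontiers) < len(robots):
        raise ValueError("need at least one frontier per robot")

    best_cost, best_perm = float("inf"), ()
    # Each permutation picks len(robots) distinct frontier indices, in robot order.
    for perm in permutations(range(len(frontiers)), len(robots)):
        cost = sum(
            hypot(robots[i][0] - frontiers[j][0], robots[i][1] - frontiers[j][1])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return dict(enumerate(best_perm)), best_cost


# Hypothetical example: two robots, three candidate frontier points.
robots = [(0.0, 0.0), (10.0, 0.0)]
frontiers = [(9.0, 1.0), (1.0, 1.0), (5.0, 8.0)]
assignment, total_cost = assign_goals(robots, frontiers)
print(assignment, round(total_cost, 3))
```

In this example each robot is matched to the frontier nearest to it, and the distant third frontier is left unassigned. A learned matcher, as proposed in the paper, would replace the fixed distance cost with scores derived from the topological map and the encoded environment structure.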
