Asymmetric Information Enhanced Mapping Framework for Multirobot Exploration based on Deep Reinforcement Learning

Jiyu Cheng, Junhui Fan, Xiaolei Li, Paul L. Rosin, Yibin Li, Wei Zhang

TL;DR

This work addresses efficient multirobot exploration in unknown environments by introducing AIM-Mapping, a framework that exploits privileged information during training via an asymmetric actor–critic design and mutual information guidance. The method combines an Asymmetric Feature Representation, a Mutual Information Evaluation module, and a Graph-based Multi-Agent Decision-making Network to construct a topological environment representation and assign long-term goals through graph matching. Empirical results in iGibson simulations and real-world trials show AIM-Mapping achieves higher exploration efficiency and map completeness than several baselines, with strong generalization to different robot counts and scenes. The approach offers a scalable, information-rich strategy for cooperative exploration with practical relevance to autonomous inspection and search tasks.

Abstract

Despite the great development of multirobot technologies, efficiently and collaboratively exploring an unknown environment is still a big challenge. In this paper, we propose AIM-Mapping, an Asymmetric InforMation Enhanced Mapping framework. The framework fully utilizes privileged information during training to help construct the environment representation and to provide the supervision signal in an asymmetric actor-critic training framework. Specifically, privileged information is used to evaluate exploration performance through an asymmetric feature representation module and a mutual information evaluation module. The decision-making network uses the trained feature encoder to extract structural information from the environment and combines it with a topological map constructed from geometric distances. Using this topological map representation, we employ topological graph matching to assign boundary points to each robot as long-term goal points. We conduct experiments in real-world-like scenarios using the Gibson simulation environments. The results validate that the proposed method achieves a substantial performance improvement over existing methods.
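To make the goal-assignment step concrete, here is a minimal sketch of giving each robot a distinct boundary point as its long-term goal. It is not the learned topological graph matching described in the abstract: as a stand-in, it uses only geometric (Euclidean) distance and a brute-force search for the minimum-total-distance assignment. The function name `assign_goals` and the sample coordinates are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import dist


def assign_goals(robots, boundary_points):
    """Return {robot_index: boundary_point_index} minimising total distance.

    Assumption: a purely geometric stand-in for learned graph matching.
    Brute force over permutations, so only practical for a few robots.
    """
    if len(boundary_points) < len(robots):
        raise ValueError("need at least one boundary point per robot")
    best_cost, best = float("inf"), None
    # Try every way of giving each robot a distinct boundary point.
    for perm in permutations(range(len(boundary_points)), len(robots)):
        cost = sum(dist(robots[i], boundary_points[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return dict(enumerate(best))


# Hypothetical 2D positions in map coordinates.
robots = [(0.0, 0.0), (5.0, 5.0)]
boundary_points = [(6.0, 5.0), (0.0, 1.0), (10.0, 10.0)]
assignment = assign_goals(robots, boundary_points)
print(assignment)
```

In this toy case each robot is matched to the boundary point next to it, and the distant third point stays unassigned. A learned matcher would additionally weigh structural features of the map rather than distance alone.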

Paper Structure

This paper contains 31 sections, 20 equations, 11 figures, 5 tables, and 1 algorithm.

Figures (11)

  • Figure 1: Multirobot collaborative active mapping task contains three main sub-task modules: perception and map creation, long-term goal selection, and short-term path planning.
  • Figure 2: The overall AIM-Mapping framework. Asymmetric Feature Representation is used to generate the state value and the partial-map feature mapping. Multi-Agent Decision-making Network combines the geometric distance information and structural information to formulate the topological graph representation, and adopts graph matching to generate the corresponding goal point. Mutual Information Evaluation is utilized to facilitate the training process. Solid black arrows represent the forward data flow through the network. Brown dashed arrows indicate the gradient backpropagation paths used during training for updating network parameters.
  • Figure 3: Multi-agent decision-making network based on topological graph matching. The network cascades internal and external information fusion over the graph, performs graph matching between the representations of the robots and the boundary points, and assigns the matched boundary points to the robots as long-term goal points.
  • Figure 4: Training performance comparison. Training results of the proposed AIM-Mapping and the baseline method NCM, compared with the performance of the planning-based baseline methods (Utility, mTSP, Voronoi, and CoScan) on the training set.
  • Figure 5: Illustration of average exploration rate variation during test episodes.
  • ...and 6 more figures