MARVEL: Multi-Agent Reinforcement Learning for constrained field-of-View multi-robot Exploration in Large-scale environments

Jimmy Chiun, Shizhe Zhang, Yizhuo Wang, Yuhong Cao, Guillaume Sartoretti

TL;DR

MARVEL tackles multi-agent exploration with constrained FoV sensors by formulating the problem as a multi-agent reinforcement learning task and solving it with a graph-attention policy and a centralized critic under the centralized-training, decentralized-execution (CTDE) paradigm, augmented by an information-driven action pruning strategy. The method fuses frontier signals and orientation information through a graph-based encoder–decoder, enabling non-myopic viewpoint planning and robust coordination across diverse sensor configurations. Evaluations on 100 unseen $90\text{ m} \times 90\text{ m}$ maps with 4 agents show MARVEL outperforming state-of-the-art planners on the trajectory-length and 90%-exploration metrics while achieving a 100% success rate, with low decision latency and strong generalization. Real-world validation on Crazyflie drones confirms practical applicability for lightweight aerial platforms in indoor environments.

Abstract

In multi-robot exploration, a team of mobile robots is tasked with efficiently mapping an unknown environment. While most exploration planners assume omnidirectional sensors like LiDAR, this is impractical for small robots such as drones, where lightweight, directional sensors like cameras may be the only option due to payload constraints. These sensors have a constrained field-of-view (FoV), which adds complexity to the exploration problem: the planner must optimize not only robot positioning but also sensor orientation during movement. In this work, we propose MARVEL, a neural framework that leverages graph attention networks, together with a novel frontier and orientation feature fusion technique, to develop a collaborative, decentralized policy using multi-agent reinforcement learning (MARL) for robots with constrained FoV. To handle the large action space of viewpoint planning, we further introduce a novel information-driven action pruning strategy. MARVEL improves multi-robot coordination and decision-making in challenging large-scale indoor environments, while adapting to various team sizes and sensor configurations (i.e., FoV and sensor range) without additional training. Our extensive evaluation shows that MARVEL's learned policies exhibit effective coordinated behaviors, outperforming state-of-the-art exploration planners across multiple metrics. We experimentally demonstrate MARVEL's generalizability in large-scale environments of up to 90 m by 90 m, and validate its practical applicability through successful deployment on a team of real drones.
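The abstract does not spell out how the information-driven action pruning works, so the following is only a minimal sketch of the general idea under stated assumptions: candidate sensor headings at a viewpoint are scored by how many frontier points fall inside the FoV wedge, and only the most informative headings are kept as actions. The function names (`frontiers_in_fov`, `prune_headings`), the frontier-count score, and the parameters `num_headings` and `keep` are illustrative assumptions, not the authors' implementation.

```python
import math


def frontiers_in_fov(position, heading, frontiers, fov_deg, sensor_range):
    """Count frontier points inside the sensor wedge.

    `heading` is in radians; `fov_deg` is the full FoV angle in degrees.
    The frontier-count score is an assumption made for illustration.
    """
    half_fov = math.radians(fov_deg) / 2.0
    px, py = position
    count = 0
    for fx, fy in frontiers:
        dx, dy = fx - px, fy - py
        dist = math.hypot(dx, dy)
        if dist == 0.0 or dist > sensor_range:
            continue  # skip points at the sensor origin or out of range
        bearing = math.atan2(dy, dx)
        # Wrap the angular difference into [-pi, pi).
        diff = (bearing - heading + math.pi) % (2.0 * math.pi) - math.pi
        if abs(diff) <= half_fov:
            count += 1
    return count


def prune_headings(position, frontiers, fov_deg, sensor_range,
                   num_headings=36, keep=3):
    """Keep at most `keep` headings with the highest frontier count.

    Headings that observe no frontier are dropped entirely.
    """
    candidates = [2.0 * math.pi * i / num_headings for i in range(num_headings)]
    scored = [
        (frontiers_in_fov(position, h, frontiers, fov_deg, sensor_range), h)
        for h in candidates
    ]
    # Stable sort: ties keep the smaller heading first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [h for gain, h in scored[:keep] if gain > 0]


if __name__ == "__main__":
    demo_frontiers = [(1.0, 0.0), (2.0, 0.1), (0.0, -3.0)]
    kept = prune_headings((0.0, 0.0), demo_frontiers, fov_deg=90.0,
                          sensor_range=5.0, num_headings=4, keep=2)
    print([round(math.degrees(h)) for h in kept])
```

Because the FoV and sensor range are plain arguments here, the same pruning routine applies unchanged to different sensor configurations, which mirrors the abstract's claim of adapting to varying FoV and range without retraining.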

Paper Structure

This paper contains 12 sections, 1 equation, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Illustration of multi-robot exploration. 3 drones are collaboratively exploring an indoor environment (blue region). The start region is indicated by the yellow star.
  • Figure 2: MARVEL's policy and critic network architecture. We propose policy and critic networks that leverage graph-based attention. In the graphs, blue circles indicate nodes, which are connected by edges shown as tan lines. We also extract the frontier (red dots) distribution at each node to provide more context to our neural networks.
  • Figure 3: Train/Test environments. Green box: the starting region, which is kept constant for all evaluations conducted.
  • Figure 4: Experimental validation on four nano drones. The grey mat demarcates the 4 m by 4 m flying arena.