D2M2N: Decentralized Differentiable Memory-Enabled Mapping and Navigation for Multiple Robots
Md Ishat-E-Rabban, Pratap Tokekar
TL;DR
D2M2N addresses the lack of memory in learning-based multi-robot navigation by equipping each robot with a differentiable memory that stores a compact occupancy belief as an embedding $m_i^t$ and by using a Value Iteration Network (VIN) as the action selector. The architecture separates memory maintenance (encoder–decoder–aggregator) from planning (VIN), which lets robots communicate compressed embeddings to their neighbors and keeps the model end-to-end differentiable under centralized training with decentralized execution (CTDE). Empirical results show substantial gains over MAGAT, particularly in complex maps and under partial observability, with robustness to moderate sensor noise and improved performance in multi-goal tasks. The approach reduces communication overhead while preserving planning quality, suggesting practical benefits for scalable, decentralized multi-robot systems.
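The encode, communicate, aggregate, decode loop described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the map size, embedding size, random linear weights (standing in for learned networks), and mean-pooling aggregator are all assumptions chosen only to show why exchanging embeddings is cheaper than exchanging full maps.

```python
import numpy as np

rng = np.random.default_rng(0)

MAP_SHAPE = (32, 32)   # hypothetical local occupancy grid size
EMBED_DIM = 16         # hypothetical embedding size
NUM_CELLS = MAP_SHAPE[0] * MAP_SHAPE[1]

# Random stand-ins for learned encoder/decoder weights (assumption).
W_enc = rng.normal(scale=0.1, size=(EMBED_DIM, NUM_CELLS))
W_dec = rng.normal(scale=0.1, size=(NUM_CELLS, EMBED_DIM))

def encode(belief):
    """Compress an occupancy belief grid into an embedding vector."""
    return np.tanh(W_enc @ belief.ravel())

def aggregate(own, neighbor_embeddings):
    """Fuse a robot's embedding with its neighbors' (mean pooling is an assumption)."""
    return np.mean([own, *neighbor_embeddings], axis=0)

def decode(embedding):
    """Expand an embedding into per-cell occupancy probabilities via a sigmoid."""
    logits = W_dec @ embedding
    return (1.0 / (1.0 + np.exp(-logits))).reshape(MAP_SHAPE)

# Three robots, each with its own belief; robot 0 receives the other two embeddings.
beliefs = [rng.random(MAP_SHAPE) for _ in range(3)]
embeddings = [encode(b) for b in beliefs]
fused = aggregate(embeddings[0], embeddings[1:])
decoded = decode(fused)
```

In this sketch each message carries `EMBED_DIM` numbers rather than one number per map cell, which is the source of the communication savings. Every operation is differentiable, so the stand-in weights could in principle be trained jointly with a downstream planner.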
Abstract
Recently, a number of learning-based models have been proposed for multi-robot navigation. However, these models lack memory and rely only on the robots' current observations to plan their actions. They are unable to leverage past observations to plan better paths, especially in complex environments. In this work, we propose a fully differentiable and decentralized memory-enabled architecture for multi-robot navigation and mapping called D2M2N. D2M2N maintains a compact representation of the environment to remember past observations and uses a Value Iteration Network for complex navigation. We conduct extensive experiments to show that D2M2N significantly outperforms the state-of-the-art model in complex mapping and navigation tasks.
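For readers unfamiliar with Value Iteration Networks, the sketch below shows the classical value iteration recurrence on an occupancy grid, which a VIN approximates with learned convolutions and a max over action channels. It is an illustration only, under assumed conventions (4-connected deterministic moves, unit step cost, `1` marking an obstacle); the function names and map format are hypothetical and not taken from the paper.

```python
import numpy as np

# Four deterministic moves as (row, col) offsets: up, down, left, right.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def value_iteration(occupancy, goal, num_iters=50, step_cost=1.0):
    """Run a fixed number of value iteration sweeps on a 2D occupancy grid.

    occupancy: 2D array, 1 = obstacle, 0 = free (assumed map format).
    goal: (row, col) of the goal cell.
    Returns the value map; obstacles and unreachable cells hold -inf.
    """
    occ = np.asarray(occupancy, dtype=bool)
    rows, cols = occ.shape
    value = np.full((rows, cols), -np.inf)
    value[goal] = 0.0
    for _ in range(num_iters):
        # Pad with -inf so that moves leaving the map are never preferred.
        padded = np.pad(value, 1, constant_values=-np.inf)
        # Q-value of a move = value of the cell it lands in, minus the step cost.
        q = np.stack([
            padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols] - step_cost
            for dr, dc in MOVES
        ])
        value = q.max(axis=0)    # max over actions
        value[occ] = -np.inf     # obstacles are not traversable
        value[goal] = 0.0        # the goal is absorbing
    return value

def greedy_action(value, cell):
    """Return the index into MOVES of the in-bounds neighbor with the highest value."""
    rows, cols = value.shape
    best, best_value = None, -np.inf
    for idx, (dr, dc) in enumerate(MOVES):
        r, c = cell[0] + dr, cell[1] + dc
        if 0 <= r < rows and 0 <= c < cols and value[r, c] > best_value:
            best, best_value = idx, value[r, c]
    return best

# A wall forces a detour from the top-left corner to the goal at the bottom-left.
grid = np.array([[0, 0, 0],
                 [1, 1, 0],
                 [0, 0, 0]])
values = value_iteration(grid, goal=(2, 0))
first_move = greedy_action(values, (0, 0))
```

Here `values[0, 0]` is the negated length of the shortest obstacle-free path to the goal, and `first_move` selects the move to the right, the only step that makes progress. A VIN replaces the fixed shift-and-subtract step with trainable convolution kernels, which keeps the planner differentiable.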
