CAMON: Cooperative Agents for Multi-Object Navigation with LLM-based Conversations
Pengying Wu, Yao Mu, Kangjie Zhou, Ji Ma, Junting Chen, Chang Liu
TL;DR
CAMON tackles cooperative multi-object navigation in indoor environments by enabling multiple robots to communicate and coordinate via LLMs within a communication-triggered dynamic leadership framework. It fuses perception, planning, and control into a decentralized pipeline in which each agent maintains a local semantic map, describes room-level scenes, and negotiates task division through LLM-driven proposals and leader coordination. Key contributions include a perception module for room-aware scene understanding, a dynamic leadership mechanism that balances information flow, and a planning workflow that minimizes communication while reaching fast consensus on actions and object targets. The approach is aimed at robust, scalable collaboration for home-service robotics, with potential extensions to dynamic objects and cross-floor navigation.
Abstract
Visual navigation tasks are critical for household service robots. As these tasks become increasingly complex, effective communication and collaboration among multiple robots become imperative for successful completion. In recent years, large language models (LLMs) have exhibited remarkable comprehension and planning abilities in the context of embodied agents. However, their application in household scenarios, specifically to multiple agents that collaborate through communication to complete complex navigation tasks, remains unexplored. Therefore, this paper proposes a framework for decentralized multi-agent navigation that leverages LLM-enabled communication and collaboration. By designing a communication-triggered dynamic leadership organization structure, we achieve faster team consensus with fewer communication instances, leading to better navigation effectiveness and collaborative exploration efficiency. With the proposed communication scheme, our framework promises to be conflict-free and robust in multi-object navigation tasks, even when team size surges.
