LLM-MCoX: Large Language Model-based Multi-robot Coordinated Exploration and Search
Ruiyang Wang, Hao-Lun Hsu, David Hunt, Shaocheng Luo, Jiwoo Kim, Miroslav Pajic
TL;DR
The paper tackles efficient multi-robot exploration and object search in unknown indoor environments by introducing LLM-MCoX, a centralized planner that fuses a shared LiDAR-based occupancy map with representative frontiers, doorway cues, and natural-language guidance. By encoding the global map as a grayscale image and leveraging multimodal reasoning in an LLM, LLM-MCoX assigns long-horizon, semantically aware waypoint sequences to each robot, using execution feedback and plan memory to keep successive plans coherent. Across simulations and real-world experiments, the approach outperforms greedy and Voronoi-based (DVC) baselines, achieving up to 22.7% faster exploration and up to 50% better search efficiency in large environments with six robots, and it benefits further from natural-language hints in language-guided search. This work demonstrates the practical viability of semantically aware, centralized planning for scalable multi-robot coordination in complex, partially observed environments.
Abstract
Autonomous exploration and object search in unknown indoor environments remain challenging for multi-robot systems (MRS). Traditional approaches often rely on greedy frontier assignment strategies with limited inter-robot coordination. In this work, we introduce LLM-MCoX (LLM-based Multi-robot Coordinated Exploration and Search), a novel framework that leverages Large Language Models (LLMs) for intelligent coordination of both homogeneous and heterogeneous robot teams tasked with efficient exploration and target object search. Our approach combines real-time LiDAR scan processing for frontier cluster extraction and doorway detection with multimodal LLM reasoning (e.g., GPT-4o) to generate coordinated waypoint assignments based on shared environment maps and robot states. LLM-MCoX demonstrates superior performance compared to existing methods, including greedy and Voronoi-based planners, achieving 22.7% faster exploration times and 50% improved search efficiency in large environments with 6 robots. Notably, LLM-MCoX enables natural language-based object search capabilities, allowing human operators to provide high-level semantic guidance that traditional algorithms cannot interpret.
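To make the map-processing step concrete, the sketch below shows one common way to find frontier cells in an occupancy grid and to render that grid as a grayscale image for a multimodal model. It is an illustration only and is not taken from the paper: the cell labels, the 4-neighbor frontier rule, the gray levels, and the function names `find_frontiers` and `to_grayscale` are all assumptions, and the frontier clustering and doorway detection that LLM-MCoX performs are not shown.

```python
from typing import List, Tuple

# Hypothetical cell labels; the paper does not specify its encoding.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

Grid = List[List[int]]


def find_frontiers(grid: Grid) -> List[Tuple[int, int]]:
    """Return (row, col) of free cells that touch an unknown cell (4-neighborhood)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbor is enough
    return frontiers


def to_grayscale(grid: Grid) -> List[List[int]]:
    """Map each cell to an 8-bit gray level (assumed palette, not the paper's)."""
    palette = {OCCUPIED: 0, UNKNOWN: 127, FREE: 255}
    return [[palette[value] for value in row] for row in grid]


# A small room whose right-hand side has not been observed yet.
grid = [
    [1, 1, 1, 1],
    [1, 0, 0, -1],
    [1, 0, 0, -1],
    [1, 1, 1, 1],
]
frontiers = find_frontiers(grid)
image = to_grayscale(grid)
print(frontiers)
print(image[1])
```

In a pipeline of this kind, the frontier cells would typically be grouped into clusters and summarized by representative points before being passed, together with the rendered image and the robot states, to the planner.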
