EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
Junting Chen, Checheng Yu, Xunzhe Zhou, Tianqi Xu, Yao Mu, Mengkang Hu, Wenqi Shao, Yikai Wang, Guohao Li, Lin Shao
TL;DR
This work tackles the challenge of coordinating heterogeneous robots with differing embodiments under a large-language-model-based multi-agent system (MAS). It introduces EMOS, an embodiment-aware MAS in which agents generate their own robot resumes from URDF files and a three-stage, hierarchical planning-execution pipeline coordinates heterogeneous multi-robot systems (HMRS) in indoor environments. A key contribution is the Habitat-MAS benchmark, a simulation-based, multi-floor dataset featuring drones, wheeled robots, and legged robots, built to evaluate embodiment-aware reasoning across perception, navigation, and manipulation tasks. Experimental results, including ablations, show that robot resumes and embodiment-aware reasoning substantially improve task success, and they also reveal trade-offs in token usage and scalability, positioning EMOS as a step toward higher automation in complex HMRS settings.
Abstract
Heterogeneous multi-robot systems (HMRS) have emerged as a powerful approach for tackling complex tasks that single robots cannot manage alone. Current large-language-model-based multi-agent systems (LLM-based MAS) have shown success in areas like software development and operating systems, but applying these systems to robot control presents unique challenges. In particular, the capabilities of each agent in a multi-robot system are inherently tied to the physical composition of the robots, rather than to predefined roles. To address this issue, we introduce a novel multi-agent framework designed to enable effective collaboration among heterogeneous robots with varying embodiments and capabilities, along with a new benchmark named Habitat-MAS. One of our key designs is $\textit{Robot Resume}$: instead of adopting human-designed role play, we propose a self-prompted approach in which agents comprehend robot URDF files and call robot kinematics tools to generate descriptions of their physical capabilities, which then guide their behavior in task planning and action execution. The Habitat-MAS benchmark is designed to assess how a multi-agent framework handles tasks that require embodiment-aware reasoning, covering 1) manipulation, 2) perception, 3) navigation, and 4) comprehensive multi-floor object rearrangement. The experimental results indicate that the robot resume and the hierarchical design of our multi-agent system are essential for the effective operation of the heterogeneous multi-robot system within this intricate problem context.
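
To make the Robot Resume idea concrete, the following is a minimal sketch of how a capability description might be derived from a URDF file. It is not the EMOS implementation: the toy URDF, the function names `summarize_urdf` and `resume_text`, and the wording of the output are all assumptions made for illustration. Only the URDF elements themselves (`robot`, `link`, `joint`, and `limit` with `lower`/`upper` attributes) come from the standard format. The sketch reads joint types and limits only, and omits the kinematics tools and the LLM step that the abstract describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy robot used only for illustration; not taken from the paper.
TOY_URDF = """
<robot name="toy_mobile_arm">
  <link name="base"/>
  <link name="wheel_left"/>
  <link name="arm_link"/>
  <joint name="wheel_left_joint" type="continuous">
    <parent link="base"/>
    <child link="wheel_left"/>
  </joint>
  <joint name="arm_lift" type="prismatic">
    <parent link="base"/>
    <child link="arm_link"/>
    <limit lower="0.0" upper="1.1" effort="10" velocity="0.5"/>
  </joint>
</robot>
"""


def summarize_urdf(urdf_text):
    """Extract the robot name, link count, and per-joint type and range."""
    root = ET.fromstring(urdf_text.strip())
    joints = []
    for joint in root.findall("joint"):
        entry = {"name": joint.get("name"), "type": joint.get("type")}
        limit = joint.find("limit")
        # Continuous and fixed joints usually carry no position limits.
        if limit is not None and limit.get("lower") and limit.get("upper"):
            entry["range"] = (float(limit.get("lower")), float(limit.get("upper")))
        joints.append(entry)
    return {
        "robot": root.get("name"),
        "num_links": len(root.findall("link")),
        "joints": joints,
    }


def resume_text(summary):
    """Render the summary as plain text that could be placed in an agent prompt."""
    lines = [f"Robot '{summary['robot']}' has {summary['num_links']} links."]
    for joint in summary["joints"]:
        if "range" in joint:
            low, high = joint["range"]
            lines.append(
                f"- {joint['name']} ({joint['type']}): moves between {low} and {high}."
            )
        else:
            lines.append(f"- {joint['name']} ({joint['type']}): no position limits listed.")
    return "\n".join(lines)


summary = summarize_urdf(TOY_URDF)
print(resume_text(summary))
```

In a fuller system, a text summary of this kind would presumably be combined with results from kinematics tools, such as reachable workspace, before being handed to the planning agents.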
