GenSwarm: Scalable Multi-Robot Code-Policy Generation and Deployment via Language Models

Wenkang Ji; Huaben Chen; Mingyang Chen; Guobin Zhu; Lufeng Xu; Roderich Groß; Rui Zhou; Ming Cao; Shiyu Zhao

TL;DR

This work introduces GenSwarm, an end-to-end system that leverages large language models to automatically generate and deploy control policies for multi-robot tasks based on simple user instructions in natural language.

Abstract

The development of control policies for multi-robot systems traditionally follows a complex and labor-intensive process, often lacking the flexibility to adapt to dynamic tasks. This has motivated research on methods to automatically create control policies. However, these methods require iterative processes of manually crafting and refining objective functions, thereby prolonging the development cycle. This work introduces GenSwarm, an end-to-end system that leverages large language models to automatically generate and deploy control policies for multi-robot tasks based on simple user instructions in natural language. As a multi-language-agent system, GenSwarm achieves zero-shot learning, enabling rapid adaptation to altered or unseen tasks. The white-box nature of the code policies ensures strong reproducibility and interpretability. With its scalable software and hardware architectures, GenSwarm supports efficient policy deployment on both simulated and real-world multi-robot systems, realizing an instruction-to-execution end-to-end functionality that could prove valuable for robotics specialists and non-specialists alike. The code of the proposed GenSwarm system is available online: https://github.com/WindyLab/GenSwarm.

Paper Structure

This paper contains 20 sections, 12 equations, 14 figures.

Figures (14)

Figure 1: The pipeline of GenSwarm. GenSwarm consists of three modules: task analysis, code generation, and code deployment and improvement. The task analysis module extracts constraints from user instructions and builds a skill library. The code generation module uses a skill graph to hierarchically create and refine Python functions, ensuring constraint alignment and code reusability. Finally, the code deployment and improvement module enables automatic code deployment in simulation and real-world platforms, incorporating feedback from video analysis and human input to refine policies.
Figure 1: Performance evaluation for the encircling task subject to sensor noise. As the noise level increases from $\sigma_0 = 0$ to $\sigma_0 = 1.0$, the tracking error generally increases.
Figure 2: Software components of GenSwarm. A control station generates the required code based on the proposed pipeline and uses Ansible to wirelessly connect to each robot. First, each robot runs Playbook-defined tasks, such as installing and configuring the Docker environment. Then, two pre-built Docker images are pulled: one with the ROS environment for robot operation, and the other with the Python environment for code execution. Once the environments are ready, the generated code is transmitted to all robots and then executed onboard.
Figure 2: Performance comparison between different methods. The figure compares our method (GenSwarm) against three baselines (MetaGPT, CaP, LLM2Swarm) and fine-tuned state-of-the-art (SOTA) expert controllers on six tasks over one hundred trials each. All eight metrics are normalized so that lower is better. GenSwarm achieves the best results among the LLM-based methods, and its best-performing policies are competitive with the SOTA controllers.
Figure 3: Hardware components of GenSwarm. As a major upgrade of our previous robotic platform [sun2023mean], each robot has the onboard computational, control, and communication resources to support autonomous code deployment and execution. The multi-robot system features one-click all start, one-click all sleep, and wireless data retrieval functions that can significantly reduce experimental costs. Since the robots do not have onboard vision systems, perception is emulated: relevant motion information is collected by an indoor motion capture system and then distributed to the robots through an MQTT coordination server, ensuring each robot receives only the local information of its surroundings.
...and 9 more figures
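
The hardware caption above says that perception is emulated by distributing only local information to each robot. The following is a minimal sketch of one way such filtering could look, not the GenSwarm implementation. The function name local_observations, the sensing_radius parameter, the 2D position format, and the output dictionary layout are all assumptions made for illustration; the MQTT transport itself is left out.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def local_observations(
    positions: Dict[str, Position],
    sensing_radius: float,
) -> Dict[str, List[dict]]:
    """Build a per-robot view from global motion-capture positions.

    Hypothetical helper: each robot only sees the other robots that lie
    within `sensing_radius` (an assumed parameter), expressed relative
    to its own position and sorted from nearest to farthest.
    """
    observations: Dict[str, List[dict]] = {}
    for robot_id, (x, y) in positions.items():
        neighbors = []
        for other_id, (ox, oy) in positions.items():
            if other_id == robot_id:
                continue  # a robot does not observe itself
            dx, dy = ox - x, oy - y
            distance = math.hypot(dx, dy)
            if distance <= sensing_radius:
                neighbors.append(
                    {
                        "id": other_id,
                        "relative_position": (dx, dy),
                        "distance": distance,
                    }
                )
        observations[robot_id] = sorted(neighbors, key=lambda n: n["distance"])
    return observations


# Example: r1 and r2 are close to each other, r3 is far from both.
example = local_observations(
    {"r1": (0.0, 0.0), "r2": (1.0, 0.0), "r3": (5.0, 5.0)},
    sensing_radius=2.0,
)
```

In a setup like the one the caption describes, a coordination server could compute such a per-robot view from the motion capture data and publish each entry on a robot-specific topic, so that no robot receives the global state.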