
Challenges Faced by Large Language Models in Solving Multi-Agent Flocking

Peihan Li, Vishnu Menon, Bhavanaraj Gudiguntla, Daniel Ting, Lifeng Zhou

TL;DR

The paper investigates whether per-agent LLMs can enable decentralized flocking in multi-agent systems governed by Boids rules. It proposes a framework where each agent runs its own LLM guided by role-based prompts and round-based updates, testing formations such as circle, triangle, and line with five agents. The findings show that GPT-3.5-Turbo frequently fails to maintain prescribed inter-agent distances and tends to converge to a single point, with GPT-4-Turbo also exhibiting high format-violation failures, underscoring gaps in spatial and collaborative reasoning. The study highlights fundamental limitations of current LLMs for spatially grounded, decentralized coordination and points to future work on improving spatial reasoning, incorporating visual data, and decomposing tasks to enable robust multi-agent flocking.

Abstract

Flocking is a behavior where multiple agents in a system attempt to stay close to each other while avoiding collision and maintaining a desired formation. This is observed in the natural world and has applications in robotics, including natural disaster search and rescue, wild animal tracking, and perimeter surveillance and patrol. Recently, large language models (LLMs) have displayed an impressive ability to solve various collaboration tasks as individual decision-makers. Solving multi-agent flocking with LLMs would demonstrate their usefulness in situations requiring spatial and decentralized decision-making. Yet, when LLM-powered agents are tasked with implementing multi-agent flocking, they fall short of the desired behavior. After extensive testing, we find that agents with LLMs as individual decision-makers typically opt to converge on the average of their initial positions or diverge from each other. After breaking the problem down, we discover that LLMs cannot understand maintaining a shape or keeping a distance in a meaningful way. Solving multi-agent flocking with LLMs would enhance their ability to understand collaborative spatial reasoning and lay a foundation for addressing more complex multi-agent tasks. This paper discusses the challenges LLMs face in multi-agent flocking and suggests areas for future improvement and research.

Paper Structure

This paper contains 12 sections, 3 equations, 4 figures.

Figures (4)

  • Figure 1: MAE of ten tests for different numbers of agents and different flock formations. The red dashed line shows the desired MAE (0.2 margin).
  • Figure 2: Snapshots of the trajectories of selected tests with five agents at different rounds. (a)-(d) show the flock tasked with forming a circle pattern, (e)-(h) show the flock tasked with forming an $\alpha$-lattice pattern, and (i)-(l) show the flock tasked with forming a V-shape.
  • Figure 3: Snapshots of the trajectory from a selected test with five agents forming a circle. The desired distance between each agent is 5 units. Here, we focus on the reasoning and decisions of Agent 4's LLM.
  • Figure 4: Snapshots of the trajectory from a sample test with one active and one stationary agent. The desired distance is 10 units. The corresponding reasoning from the active agent for each round is shown on the right.
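
The captions above refer to an MAE with a 0.2 margin and to a desired inter-agent distance. As a rough illustration of how such a metric could be computed, the sketch below averages the absolute deviation of every pairwise distance from the desired distance. The function names, the all-pairs averaging, and the margin check are assumptions made for illustration; the paper's exact MAE definition is not given in this summary and may differ (for example, by considering only neighboring agents).

```python
import math
from itertools import combinations


def pairwise_distance_mae(positions, desired_distance):
    """Mean absolute error between each pairwise distance and the desired one.

    Assumption: every pair of agents is compared against the same desired
    distance. This is a simplification, not the paper's stated definition.
    """
    errors = [
        abs(math.dist(p, q) - desired_distance)
        for p, q in combinations(positions, 2)
    ]
    return sum(errors) / len(errors)


def within_margin(positions, desired_distance, margin=0.2):
    """Hypothetical success check: MAE at or below the margin."""
    return pairwise_distance_mae(positions, desired_distance) <= margin


# Two agents exactly 10 units apart (the setting described for Figure 4).
two_agents = [(0.0, 0.0), (6.0, 8.0)]
print(pairwise_distance_mae(two_agents, 10.0))

# Five agents collapsed onto one point, the failure mode the abstract reports.
collapsed = [(1.0, 1.0)] * 5
print(pairwise_distance_mae(collapsed, 5.0))
print(within_margin(collapsed, 5.0))
```

Under this definition, a flock that converges to a single point has an MAE equal to the desired distance itself, which is far above a 0.2 margin.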