CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic

Huaiyuan Yao, Longchao Da, Vishnu Nandam, Justin Turnau, Zhiwei Liu, Linsey Pang, Hua Wei

TL;DR

CoMAL introduces a knowledge-driven, collaborative multi-agent framework for mixed-autonomy traffic that uses LLMs to coordinate connected autonomous vehicles (CAVs) with human drivers. It combines a perception- and memory-enabled single-agent pipeline with a three-module multi-agent workflow (Collaboration, Reason Engine, Execution) to generate IDM-based planners that are executed by a rule-based controller. Experimental results on the Flow benchmarks (Ring, Figure Eight, Merge) show improvements in average velocity and traffic stability across several LLMs, with ablations validating the contributions of perception, memory, and collaboration. The work underscores the potential of LLM-driven coordination to complement traditional control and learning-based approaches, and it points to scaling and hybrid RL-LLM strategies as promising future directions. Specifically, CoMAL encodes IDM dynamics through $a_k = \frac{dv_k}{dt} = a_{\max} \left[ 1 - \left(\frac{v_k}{v_0}\right)^{\delta} - \left(\frac{s^*(v_k, \Delta v_k)}{s_k}\right)^{2} \right]$ with $s^*(v_k, \Delta v_k) = s_0 + \max\left(0, v_k T + \frac{v_k \Delta v_k}{2\sqrt{a_{\max} b}}\right)$ to translate high-level plans into actionable control, enabling effective, interpretable collaboration among heterogeneous agents.
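
The two IDM expressions above can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function names and all default parameter values are illustrative assumptions, and $\Delta v_k$ is assumed to follow the usual IDM convention of being the approach rate (own speed minus the leader's speed).

```python
import math


def idm_desired_gap(v, dv, s0=2.0, T=1.0, a_max=1.0, b=1.5):
    """Desired gap s*(v, dv) = s0 + max(0, v*T + v*dv / (2*sqrt(a_max*b))).

    v  : own speed v_k (m/s)
    dv : approach rate Delta v_k (m/s), assumed to be own speed minus leader speed
    Defaults are illustrative assumptions, not values from the paper.
    """
    return s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))


def idm_acceleration(v, dv, s, v0=30.0, delta=4.0,
                     s0=2.0, T=1.0, a_max=1.0, b=1.5):
    """Acceleration a_k = a_max * [1 - (v/v0)^delta - (s*/s)^2].

    s  : current gap s_k to the leader (m), must be positive
    v0 : desired free-flow speed (m/s)
    """
    s_star = idm_desired_gap(v, dv, s0=s0, T=T, a_max=a_max, b=b)
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)
```

With these assumed defaults, a stationary vehicle on an empty road (`idm_acceleration(0.0, 0.0, 1e9)`) accelerates at roughly `a_max`, while a vehicle travelling at 10 m/s only 2 m behind its leader (`idm_acceleration(10.0, 0.0, 2.0)`) gets a strongly negative acceleration. An LLM-generated planner in this setting would amount to a choice of parameters such as `v0`, `T`, and `s0` that the rule-based controller then applies.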

Abstract

The integration of autonomous vehicles into urban traffic has great potential to improve efficiency by reducing congestion and optimizing traffic flow systematically. In this paper, we introduce CoMAL (Collaborative Multi-Agent LLMs), a framework that addresses the mixed-autonomy traffic problem through collaboration among autonomous vehicles to optimize traffic flow. CoMAL is built upon large language models and operates in an interactive traffic simulation environment. It uses a Perception Module to observe surrounding agents and a Memory Module to store strategies for each agent. The overall workflow includes a Collaboration Module that encourages autonomous vehicles to discuss effective strategies and allocate roles, a Reason Engine that determines optimal behaviors based on the assigned roles, and an Execution Module that controls vehicle actions using a hybrid approach that combines LLM-generated planners with rule-based models. Experimental results demonstrate that CoMAL achieves superior performance on the Flow benchmark. Additionally, we evaluate the impact of different language models and compare our framework with reinforcement learning approaches. The results highlight the strong cooperative capability of LLM agents and present a promising solution to the mixed-autonomy traffic challenge. The code is available at https://github.com/Hyan-Yao/CoMAL.

Paper Structure

This paper contains 31 sections, 2 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: The overall framework of CoMAL. (a) Single-Agent Pipeline: The prompt generator integrates scenario descriptions, few-shot experiences, and shared messages, which are then fed into the LLM. The LLM subsequently allocates tasks and generates planners. (b) Multi-Agent Workflow: comprises three modules, namely the Collaboration Module, the Reason Engine, and the Execution Module. (c) The three benchmark scenarios for CoMAL. Ring: The ring road network consists of a closed-loop road where vehicles continuously travel in a circular fashion. Figure Eight (FE): An extension of the ring road, consisting of two circular loops connected by an intersection. Merge: The merge network simulates how vehicles entering from an on-ramp cause disturbances.
  • Figure 2: (a) Left: A detailed prompt example for CoMAL, consisting of a system prompt that specifies the driving task, along with map description and motion state provided by the Perception Module. (b) Right: A case of the collaboration and reasoning process. Following task allocation during brainstorming, a hierarchical chain of thought breaks down the driving plan into incremental steps, ensuring consistency in decision-making. This process includes role clarification, scene understanding, motion instruction, and planner generation.
  • Figure 3: Demonstration of the agents' interaction process in the Figure Eight 1 scenario. The agents decide to form a queue and subsequently allocate the roles of leader and follower.
  • Figure 4: Visualization of vehicle trajectories in the Ring 0 setting. The ring road has a total length of 230 meters and contains 22 vehicles. Each line in the space-time diagrams shows the position of a specific vehicle over time, with color indicating its speed. When a vehicle completes a full lap of the ring, its position resets to zero. Left: In the absence of automated vehicles, human-driven vehicles exhibit stop-and-go shockwaves due to inherent instability. Right: With three connected autonomous vehicles using the CoMAL framework, the unstable vehicles are stabilized.
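
The position reset described for Figure 4 is ordinary wrap-around arithmetic on a closed loop. The sketch below illustrates it under stated assumptions: the ring length and vehicle count come from the caption, while the function names and the center-to-center definition of headway are hypothetical choices made for illustration, not details from the paper.

```python
RING_LENGTH = 230.0   # metres, from the Figure 4 caption
NUM_VEHICLES = 22     # vehicles on the ring, from the Figure 4 caption


def wrap_position(x, ring_length=RING_LENGTH):
    """Map a cumulative travelled distance to a position in [0, ring_length)."""
    return x % ring_length


def ring_headways(positions, ring_length=RING_LENGTH):
    """Center-to-center distance from each vehicle to the next one ahead.

    positions : wrapped positions in [0, ring_length), one per vehicle
    Returns a list of gaps in the same order as `positions`.
    """
    # Sort vehicle indices by position so each one's leader is the next in order.
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    gaps = [0.0] * len(positions)
    for rank, i in enumerate(order):
        leader = order[(rank + 1) % len(order)]
        # The modulo handles the last vehicle, whose leader sits just past zero.
        gaps[i] = (positions[leader] - positions[i]) % ring_length
    return gaps


# Evenly spaced initial placement of the 22 vehicles.
initial_positions = [i * RING_LENGTH / NUM_VEHICLES for i in range(NUM_VEHICLES)]
initial_gaps = ring_headways(initial_positions)
```

With an evenly spaced start, every headway equals 230 / 22 metres, or about 10.45 m. Stop-and-go waves show up in a space-time diagram as these gaps repeatedly compressing and stretching around that value.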