Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances
Yaozu Wu, Dongyuan Li, Yankai Chen, Renhe Jiang, Henry Peng Zou, Wei-Chieh Huang, Yangning Li, Liancheng Fang, Zhen Wang, Philip S. Yu
TL;DR
This survey examines how Large Language Models are integrated into autonomous driving, focusing on multi-agent systems that address the perception, collaboration, and computational limits of single-agent Autonomous Driving Systems (ADSs). It develops a taxonomy around agent environments and profiles, inter-agent interaction modes, and agent-human collaboration, and reviews three primary interaction types (multi-vehicle, vehicle-infrastructure, and vehicle-assistant) along with applications, datasets, and benchmarks. The work highlights mechanisms for collaborative perception and decision-making, as well as cloud-edge deployment and assistant tools, and outlines key challenges such as safety, privacy, and real-world deployment. By mapping existing methods and resources, the paper aims to guide NLP researchers and the autonomous driving community toward scalable, safe, and human-aligned LLM-based multi-agent ADSs.
Abstract
Autonomous Driving Systems (ADSs) are revolutionizing transportation by reducing human intervention, improving operational efficiency, and enhancing safety. Large Language Models (LLMs) have been integrated into ADSs to support high-level decision-making through their powerful reasoning, instruction-following, and communication abilities. However, LLM-based single-agent ADSs face three major challenges: limited perception, insufficient collaboration, and high computational demands. To address these issues, recent advances in LLM-based multi-agent ADSs leverage language-driven communication and coordination to enhance inter-agent collaboration. This paper provides a frontier survey of this emerging intersection between NLP and multi-agent ADSs. We begin with a background introduction to related concepts, followed by a categorization of existing LLM-based methods based on different agent interaction modes. We then discuss agent-human interactions in scenarios where LLM-based agents engage with humans. Finally, we summarize key applications, datasets, and challenges to support future research.
