CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving
Changxing Liu, Genjia Liu, Zijun Wang, Jinchang Yang, Siheng Chen
TL;DR
CoLMDriver addresses cooperative autonomous driving by enabling language-based negotiation among vehicles and translating the negotiated outcomes into real-time control. The method adopts a parallel two-pipeline architecture: an LLM-based negotiator operating in an actor-critic loop, and a VLM planner that produces high-level driving intentions to guide a waypoint generator. The objective is formalized as maximizing $\sum_{i=1}^N d(\Phi_\theta(\mathcal{X}_i, \mathcal{D}_i, \mathcal{M}_i^k))$ over multi-round negotiations, with dynamic grouping and a negotiation quality evaluator guiding policy refinement. On the CARLA-based InterDrive benchmark, CoLMDriver achieves an improvement of about 11% in success rate across diverse scenarios, illustrating the practical benefit of language-based coordination for V2V cooperation. The limited diversity of language interactions is noted as a limitation to address in future work.
Abstract
Vehicle-to-vehicle (V2V) cooperative autonomous driving holds great promise for improving safety by addressing the perception and prediction uncertainties inherent in single-agent systems. However, traditional cooperative methods are constrained by rigid collaboration protocols and limited generalization to unseen interactive scenarios. While LLM-based approaches offer generalized reasoning capabilities, their weaknesses in spatial planning and their unstable inference latency hinder direct application to cooperative driving. To address these limitations, we propose CoLMDriver, the first full-pipeline LLM-based cooperative driving system, enabling effective language-based negotiation and real-time driving control. CoLMDriver features a parallel driving pipeline with two key components: (i) an LLM-based negotiation module under an actor-critic paradigm, which continuously refines cooperation policies through feedback from previous decisions of all vehicles; and (ii) an intention-guided waypoint generator, which translates negotiation outcomes into executable waypoints. Additionally, we introduce InterDrive, a CARLA-based simulation benchmark comprising 10 challenging interactive driving scenarios for evaluating V2V cooperation. Experimental results demonstrate that CoLMDriver significantly outperforms existing approaches, achieving an 11% higher success rate across diverse, highly interactive V2V driving scenarios. Code will be released at https://github.com/cxliu0314/CoLMDriver.
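To make the actor-critic negotiation loop concrete, the following is a minimal sketch of how multi-round negotiation with critic feedback could be organized. It is not the CoLMDriver implementation: the names `Vehicle`, `actor`, `critic`, and `negotiate`, the `eta` field, and the "go"/"yield" intentions are all hypothetical, and simple rules stand in for the LLM negotiator and the quality evaluator.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    name: str
    eta: float  # toy value: seconds until the shared conflict point (assumption)


def actor(vehicles, feedback):
    """Rule-based stand-in for the LLM negotiator: one intention per vehicle."""
    if feedback is None:
        # First round: every vehicle greedily asks to go.
        return {v.name: "go" for v in vehicles}
    # Later rounds: adopt the critic's suggestion of who proceeds first.
    return {v.name: ("go" if v.name == feedback else "yield") for v in vehicles}


def critic(vehicles, proposal):
    """Rule-based stand-in for the evaluator: score the joint proposal."""
    going = [name for name, intent in proposal.items() if intent == "go"]
    if len(going) == 1:
        return 1.0, None  # consensus: exactly one vehicle proceeds
    # Conflict or deadlock: suggest that the vehicle with the smallest ETA goes.
    first = min(vehicles, key=lambda v: v.eta).name
    return 0.0, first


def negotiate(vehicles, max_rounds=3, threshold=1.0):
    """Alternate actor proposals and critic feedback until the score is high enough."""
    feedback = None
    proposal = {}
    for round_idx in range(1, max_rounds + 1):
        proposal = actor(vehicles, feedback)
        score, feedback = critic(vehicles, proposal)
        if score >= threshold:
            return proposal, round_idx
    return proposal, max_rounds


if __name__ == "__main__":
    result, rounds = negotiate([Vehicle("A", 3.0), Vehicle("B", 1.5)])
    print(result, rounds)
```

In this toy setting, the first round ends in a conflict because both vehicles ask to go; the critic's feedback then lets the actor revise the proposal in the second round. In the system described above, the resulting intentions would be handed to the waypoint generator rather than printed.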
