Interact, Instruct to Improve: A LLM-Driven Parallel Actor-Reasoner Framework for Enhancing Autonomous Vehicle Interactions
Shiyu Fang, Jiaqi Liu, Chengkai Xu, Chen Lv, Peng Hang, Jian Sun
TL;DR
This work tackles real-time bidirectional interaction between autonomous vehicles (AVs) and human-driven vehicles (HVs) with a parallel Actor-Reasoner framework, in which an LLM-driven Reasoner and a memory-based Actor express and interpret driving intentions. The Reasoner uses Chain-of-Thought reasoning to infer each HV's intention and driving style and to generate external Human-Machine Interface (eHMI) cues, while the Actor rapidly retrieves feasible actions from a partitioned memory through two-layer retrieval, enabling fast, context-aware decisions. Ablation studies and multi-vehicle experiments show that the memory partitioning and retrieval mechanisms substantially improve safety and efficiency, and field tests confirm practical applicability in real-world traffic. The framework aims to offer improved interpretability, adaptability to heterogeneous HVs, and scalable deployment for real-time AV-HV interaction.
Abstract
Autonomous Vehicles (AVs) have entered the commercialization stage, but their limited ability to interact and express intentions still poses challenges in interactions with Human-driven Vehicles (HVs). Recent advances in large language models (LLMs) enable bidirectional human-machine communication, but their slow inference conflicts with the need for real-time decision-making, which hinders practical deployment. To address these issues, this paper introduces a parallel Actor-Reasoner framework designed to enable explicit bidirectional AV-HV interactions across multiple scenarios. First, by facilitating interactions between the LLM-driven Reasoner and heterogeneous simulated HVs during training, an interaction memory database, referred to as the Actor, is established. Then, by introducing the memory partition module and the two-layer memory retrieval module, the Actor's ability to handle heterogeneous HVs is significantly enhanced. Ablation studies and comparisons with other decision-making methods demonstrate that the proposed Actor-Reasoner framework significantly improves safety and efficiency. Finally, by combining the external Human-Machine Interface (eHMI) information derived from the Reasoner's reasoning with the feasible actions retrieved from the Actor, the effectiveness of the proposed Actor-Reasoner framework is confirmed in multi-scenario field interactions. Our code is available at https://github.com/FanGShiYuu/Actor-Reasoner.
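To make the division of labor concrete, the following is a minimal Python sketch of a partitioned memory with two-layer retrieval running alongside a slow reasoning call. It is not the authors' implementation. Partitioning by inferred driving style, using scenario distance as the first retrieval layer and instruction word overlap as the second, and all names (`Memory`, `Actor`, `reasoner_stub`, `step`, the feature tuple, the action labels) are assumptions chosen for illustration; the Reasoner is replaced by a trivial keyword rule rather than an LLM.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from math import dist


@dataclass
class Memory:
    scenario: tuple    # hypothetical numeric features, e.g. (gap_m, speed_mps)
    instruction: str   # hypothetical HV instruction text
    action: str        # action stored during training


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


class Actor:
    """Memory database partitioned by driving style (an assumed partition key)."""

    def __init__(self):
        self.partitions = {}

    def add(self, style, memory):
        self.partitions.setdefault(style, []).append(memory)

    def retrieve(self, style, scenario, instruction, k=2):
        pool = self.partitions.get(style, [])
        if not pool:
            return None
        # Layer 1: keep the k memories whose scenario features are closest.
        nearest = sorted(pool, key=lambda m: dist(m.scenario, scenario))[:k]
        # Layer 2: among those, pick the most similar instruction.
        best = max(nearest, key=lambda m: word_overlap(m.instruction, instruction))
        return best.action


def reasoner_stub(instruction):
    """Stand-in for the LLM Reasoner: returns (inferred style, eHMI text)."""
    style = "aggressive" if "first" in instruction.lower() else "conservative"
    ehmi = "AV will yield" if style == "aggressive" else "AV will proceed"
    return style, ehmi


def step(actor, last_style, scenario, instruction):
    """Run the slow Reasoner in a worker thread while the Actor answers at once."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(reasoner_stub, instruction)            # slow path
        action = actor.retrieve(last_style, scenario, instruction)  # fast path
        style, ehmi = future.result()  # a real control loop would not block here
    return action, style, ehmi


actor = Actor()
actor.add("aggressive", Memory((10.0, 8.0), "let me go first", "yield"))
actor.add("aggressive", Memory((11.0, 8.0), "you go ahead", "accelerate"))
actor.add("conservative", Memory((10.0, 8.0), "you go ahead", "accelerate"))

print(step(actor, "aggressive", (10.5, 8.0), "I will go first"))
```

In this sketch the Actor answers from the style inferred in the previous cycle, so the action does not wait on the Reasoner; the Reasoner's new style estimate and eHMI text would feed the next cycle.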
