Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles
Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
TL;DR
The paper tackles the challenge of integrating large language models (LLMs) into autonomous driving by addressing their lack of direct environmental perception. It proposes a human-centric framework where LLMs act as the decision-making brain while perception, localization, in-cabin monitoring, and memory provide sensory data and context, enabling safe and transparent reasoning. The work surveys techniques such as parameter-efficient fine-tuning (e.g., LoRA), reinforcement learning from human feedback (RLHF), and advanced prompting, and discusses recent advancements in embodied language models and zero-shot planning. A ChatGPT-4–driven case study demonstrates decision-making and motion planning in a complex overtaking scenario, highlighting improved interpretability, personalization, and trust. Collectively, the findings suggest that LLMs can significantly enhance autonomous vehicles by enabling natural language interactions, continuous learning, and adaptable decision-making that aligns with safety and user preferences.
Abstract
The future of autonomous vehicles lies in the convergence of human-centric design and advanced AI capabilities. Autonomous vehicles of the future will not only transport passengers but also interact with them and adapt to their desires, making the journey comfortable, efficient, and pleasant. In this paper, we present a novel framework that leverages Large Language Models (LLMs) to enhance autonomous vehicles' decision-making processes. The framework combines LLMs' natural language capabilities and contextual understanding with specialized tool usage and with reasoning and acting in synergy with the various modules on an autonomous vehicle, so that the advanced language and reasoning capabilities of LLMs are integrated seamlessly into the vehicle. The proposed framework holds the potential to revolutionize the way autonomous vehicles operate, offering personalized assistance, continuous learning, and transparent decision-making, ultimately contributing to safer and more efficient autonomous driving technologies.
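As a rough illustration of the framework summarized above, the following minimal sketch shows one way the outputs of the perception, localization, in-cabin monitoring, and memory modules might be serialized into text for an LLM that cannot perceive the scene directly. It is not the authors' implementation: the names `VehicleContext`, `build_prompt`, `decide`, and `stub_llm`, the prompt layout, and the reply format are all assumptions made for illustration, and the LLM call is replaced by a fixed stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VehicleContext:
    """Text summaries from the vehicle modules (all field names are hypothetical)."""
    perception: str      # e.g. detected objects and their distances
    localization: str    # e.g. current lane and speed
    cabin: str           # e.g. driver state from in-cabin monitoring
    memory: List[str] = field(default_factory=list)  # stored user preferences


def build_prompt(ctx: VehicleContext, command: str) -> str:
    """Serialize module outputs into text, since the LLM has no direct perception."""
    memory = "; ".join(ctx.memory) if ctx.memory else "none"
    return (
        f"Perception: {ctx.perception}\n"
        f"Localization: {ctx.localization}\n"
        f"In-cabin monitoring: {ctx.cabin}\n"
        f"Memory: {memory}\n"
        f"Passenger command: {command}\n"
        "Reply with one action and a one-sentence explanation."
    )


def decide(ctx: VehicleContext, command: str, llm: Callable[[str], str]) -> str:
    """Pass the assembled prompt to any callable that maps a prompt to a reply."""
    return llm(build_prompt(ctx, command))


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a fixed reply (assumed format)."""
    return "KEEP_LANE: no safe gap was reported for overtaking."


if __name__ == "__main__":
    ctx = VehicleContext(
        perception="slow truck 30 m ahead; car approaching in left lane",
        localization="right lane, 80 km/h",
        cabin="driver attentive",
        memory=["prefers cautious overtaking"],
    )
    print(decide(ctx, "overtake the truck", stub_llm))
```

Keeping the LLM behind a plain callable is a design choice of this sketch: the same context assembly can be tested with a stub and later pointed at a real model, and the textual reply gives the passenger an explanation alongside the chosen action.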
