Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles

Can Cui; Yunsheng Ma; Xu Cao; Wenqian Ye; Ziran Wang

TL;DR

The paper tackles the challenge of integrating large language models (LLMs) into autonomous driving by addressing their lack of direct environmental perception. It proposes a human-centric framework where LLMs act as the decision-making brain while perception, localization, in-cabin monitoring, and memory provide sensory data and context, enabling safe and transparent reasoning. The work surveys techniques such as parameter-efficient fine-tuning (e.g., LoRA), reinforcement learning from human feedback (RLHF), and advanced prompting, and discusses recent advancements in embodied language models and zero-shot planning. A ChatGPT-4–driven case study demonstrates decision-making and motion planning in a complex overtaking scenario, highlighting improved interpretability, personalization, and trust. Collectively, the findings suggest that LLMs can significantly enhance autonomous vehicles by enabling natural language interactions, continuous learning, and adaptable decision-making that aligns with safety and user preferences.

Abstract

The future of autonomous vehicles lies in the convergence of human-centric design and advanced AI capabilities. Autonomous vehicles of the future will not only transport passengers but also interact with them and adapt to their desires, making the journey comfortable, efficient, and pleasant. In this paper, we present a novel framework that leverages Large Language Models (LLMs) to enhance autonomous vehicles' decision-making processes. By combining LLMs' natural language capabilities and contextual understanding with specialized tool usage and with reasoning and acting in concert with the various modules on autonomous vehicles, this framework aims to bring the advanced language and reasoning capabilities of LLMs into autonomous vehicles. The proposed framework holds the potential to revolutionize the way autonomous vehicles operate, offering personalized assistance, continuous learning, and transparent decision-making, ultimately contributing to safer and more efficient autonomous driving technologies.

Paper Structure (7 sections, 3 figures)

Introduction
Perspective: the Role of LLMs in Advancing Autonomous Vehicles
Review: Can LLMs Really Do This?
Adaptive Techniques and Human-Centric Refinements for LLMs
Advancements in LLMs: Implications for Autonomous Driving Decision-Making
Experiment: Decision-Making and Motion Planning with ChatGPT-4
Conclusion

Figures (3)

Figure 1: The human-centric LLM-integrated framework for autonomous vehicles.
Figure 2: General Q&A with ChatGPT-4 regarding autonomous vehicles.
Figure 3: Experiment illustrating LLM-assisted decision-making and motion planning in a complex driving scenario. The ego vehicle and its trajectory are marked in orange; the vehicle ahead in the current lane and its trajectory are marked in blue; the vehicles in adjacent lanes and their trajectories are marked in green.
