Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles

Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang

TL;DR

This work proposes a human-centric framework that embeds Large Language Models (LLMs) as the decision-making core of autonomous vehicles, using perception, localization, and in-cabin monitoring as sensory inputs and vehicle controllers as actuators. Through GPT-4-driven experiments in HighwayEnv, it demonstrates that chain-of-thought prompting and in-context learning can enhance safety, transparency, and personalization, and that real-time verbal guidance can shape driving behavior. The study shows that LLMs can reason over multi-source sensor data, explain their decisions, and adapt driving styles to user commands, suggesting a path toward more intuitive and trustworthy AV systems. It also outlines future directions, including task-specific fine-tuning, benchmarks, and attention to ethical and societal implications.

Abstract

The fusion of human-centric design and artificial intelligence (AI) capabilities has opened up new possibilities for next-generation autonomous vehicles that go beyond transportation. These vehicles can dynamically interact with passengers and adapt to their preferences. This paper proposes a novel framework that leverages Large Language Models (LLMs) to enhance the decision-making process in autonomous vehicles. By combining LLMs' linguistic and contextual understanding with specialized tools, we aim to integrate their language and reasoning capabilities into autonomous vehicles. Our research includes experiments in HighwayEnv, a collection of environments for autonomous driving and tactical decision-making tasks, to explore LLMs' interpretation, interaction, and reasoning in various scenarios. We also examine real-time personalization, demonstrating how LLMs can influence driving behaviors based on verbal commands. Our empirical results highlight the substantial advantages of chain-of-thought prompting, which leads to improved driving decisions, and show the potential for LLMs to enhance personalized driving experiences through ongoing verbal feedback. The proposed framework aims to transform autonomous vehicle operations, offering personalized support, transparent decision-making, and continuous learning to enhance safety and effectiveness. We achieve user-centric, transparent, and adaptive autonomous driving ecosystems supported by the integration of LLMs into autonomous vehicles.
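
To make the contrast between standard and chain-of-thought prompting concrete, the following is a minimal sketch of how a scene description and a verbal command might be turned into a prompt, and how an action might be read back from the model's reply. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the `Vehicle` fields, and the helpers `describe_scene`, `build_prompt`, and `parse_action` are hypothetical, and the action names are borrowed from HighwayEnv's discrete meta-actions. No LLM is called; the reply is passed in as a plain string.

```python
from dataclasses import dataclass

# Action names mirror HighwayEnv's discrete meta-actions. How the paper maps
# LLM output onto vehicle actions is not stated here, so this is an assumption.
ACTIONS = ("LANE_LEFT", "IDLE", "LANE_RIGHT", "FASTER", "SLOWER")


@dataclass
class Vehicle:
    """Hypothetical summary of one surrounding vehicle."""
    name: str
    lane: int
    gap_m: float      # longitudinal gap to the ego vehicle; positive = ahead
    speed_mps: float


def describe_scene(ego_lane: int, ego_speed: float, others: list) -> str:
    """Render perception output as plain text the LLM can reason over."""
    lines = [f"Ego vehicle: lane {ego_lane}, speed {ego_speed:.1f} m/s."]
    for v in others:
        side = "ahead of" if v.gap_m >= 0 else "behind"
        lines.append(
            f"{v.name}: lane {v.lane}, {abs(v.gap_m):.1f} m {side} ego, "
            f"speed {v.speed_mps:.1f} m/s."
        )
    return "\n".join(lines)


def build_prompt(scene: str, command: str, chain_of_thought: bool) -> str:
    """Build a standard prompt, or a chain-of-thought prompt with explicit steps."""
    parts = [
        "You are the decision-making module of an autonomous vehicle.",
        "Scene:",
        scene,
        f"Passenger command: {command}",
        f"Allowed actions: {', '.join(ACTIONS)}.",
    ]
    if chain_of_thought:
        # Illustrative wording only; the paper's actual prompts are not shown here.
        parts.append(
            "Think step by step: (1) list nearby vehicles and their gaps, "
            "(2) check whether each action is safe, "
            "(3) pick the safe action that best matches the command."
        )
    parts.append("End your answer with a line of the form 'ACTION: <name>'.")
    return "\n".join(parts)


def parse_action(reply: str, default: str = "IDLE") -> str:
    """Return the last valid 'ACTION: <name>' in the reply, else a safe default."""
    for line in reversed(reply.strip().splitlines()):
        if line.strip().upper().startswith("ACTION:"):
            name = line.split(":", 1)[1].strip().upper()
            if name in ACTIONS:
                return name
    return default


if __name__ == "__main__":
    scene = describe_scene(
        1, 25.0,
        [Vehicle("Car A", 1, 18.0, 20.0), Vehicle("Car B", 0, -30.0, 27.0)],
    )
    print(build_prompt(scene, "Overtake if it is safe.", chain_of_thought=True))
    print(parse_action("The left gap is too small.\nACTION: SLOWER"))
```

Falling back to a conservative default when the reply cannot be parsed is a design choice of this sketch; a real system would need a more carefully reasoned fallback.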

Paper Structure

This paper contains 23 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: The human-centric LLMs-integrated framework for autonomous vehicles
  • Figure 2: LLMs' decision-making workflows using both standard prompting and chain-of-thought prompting in the highway scenario
  • Figure 3: Visualization of experiment results for safe overtaking scenarios in the highway environment, where the green vehicle is the ego vehicle
  • Figure 4: Visualization of experiment results for unsafe overtaking scenarios in the highway environment, where the green vehicle is the ego vehicle
  • Figure 5: Visualization of experiment results for merging scenarios in the highway environment, where the yellow vehicle is the ego vehicle