Driving Everywhere with Large Language Model Policy Adaptation

Boyi Li; Yue Wang; Jiageng Mao; Boris Ivanovic; Sushant Veer; Karen Leung; Marco Pavone

TL;DR

The paper addresses the problem of driving in regions with differing traffic laws by introducing LLaDA, a training-free pipeline that adapts nominal driving plans to local rules using a Traffic Rule Extractor (TRE) and a frozen large language model (GPT-4 by default). LLaDA extracts the relevant passages from the local driver handbook, then reasons over them together with the initial plan and any unexpected event to produce a revised plan, without modifying the underlying perception stack or low-level controllers. The authors report improvements in cross-region motion planning on nuScenes/nuPlan data, supported by user studies and ablations, and show that LLaDA can be combined with GPT-Driver for motion planning and with GPT-4V for vision inputs. The approach offers practical benefits for tourists and for AV deployment beyond geo-fenced regions; the authors acknowledge runtime constraints and leave AV-centric scene understanding and safety assurances to future work.

Abstract

Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs). In this paper, we present LLaDA, a simple yet powerful tool that enables human drivers and autonomous vehicles alike to drive everywhere by adapting their tasks and motion plans to traffic rules in new locations. LLaDA achieves this by leveraging the impressive zero-shot generalizability of large language models (LLMs) in interpreting the traffic rules in the local driver handbook. Through an extensive user study, we show that LLaDA's instructions are useful in disambiguating in-the-wild unexpected situations. We also demonstrate LLaDA's ability to adapt AV motion planning policies in real-world datasets; LLaDA outperforms baseline planning approaches on all our metrics. Please check our website for more details: https://boyiliee.github.io/llada.

Paper Structure (14 sections, 9 figures, 2 tables)

Introduction
Related Works
Driving Everywhere with Large Language Model Policy Adaptation
Applications of LLaDA
Experiments
Implementation Details
LLaDA Examples
Inference on Random nuScenes/nuPlan Videos
Challenging Situations
Evaluator-based Assessment
Comparison on Motion Planning
Ablation Study on Potential Safety Issues
Combining with GPT-4V
Conclusion, Limitations, and Future Work

Figures (9)

Figure 1: LLaDA enables drivers to obtain instructions in any region of the world. For instance, for a driver licensed in California, USA, our system provides prompt instructions when that driver encounters different situations in different regions.
Figure 2: Overview of LLaDA. In this illustration, the driver learned to drive in California but now needs to drive in New York City, where the road conditions, traffic code, and unexpected situations differ. Our system considers three inputs: the initial plan ("Turn right"), the unique traffic code of the current location (the New York City Driving Handbook), and the unexpected situation ("someone honks at me"). These three inputs are fed into a Traffic Rule Extractor (TRE), which organizes and filters them and passes its output to a frozen LLM to obtain the final new plan. In this paper, GPT-4 is the default LLM.
Figure 3: Details of the Traffic Rule Extractor (TRE). As shown in the figure, we first organize the information (such as the location, "Turn right", and "someone honks at me") into a prompt. We then feed this prompt to GPT-4 to obtain one or two keywords; to guarantee search quality, each keyword contains one or two words. Next, we find the key paragraphs in the unique traffic code that contain the extracted keywords. In this way, we retain only the necessary information and pass only that material to GPT-4 to obtain the final new plan.
Figure 4: Combining LLaDA with GPT-Driver for motion planning on the nuScenes dataset.
Figure 5: Examples of LLaDA helping drivers drive everywhere with language policy adaptation. LLaDA helps drivers obtain prompt notifications and correct their behavior in different countries, across diverse plans and diverse unexpected situations. The examples also show that, without the traffic code as background, the LLM cannot provide accurate location-specific instructions.
...and 4 more figures
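
The flow described in the captions of Figures 2 and 3 (ask the LLM for keywords, keep only the handbook paragraphs that contain them, then ask the LLM for a new plan) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the prompt wording, the comma-separated keyword format, and the blank-line paragraph splitting are all assumptions made for this example, and `llm` stands for any callable that maps a prompt string to a reply string.

```python
from typing import Callable, List


def extract_key_paragraphs(handbook: str, keywords: List[str]) -> List[str]:
    """Keep handbook paragraphs that mention at least one keyword.

    Assumption: paragraphs are separated by blank lines and matching is a
    case-insensitive substring test.
    """
    paragraphs = [p.strip() for p in handbook.split("\n\n") if p.strip()]
    lowered = [k.lower() for k in keywords]
    return [p for p in paragraphs if any(k in p.lower() for k in lowered)]


def build_keyword_prompt(location: str, plan: str, situation: str) -> str:
    # Hypothetical prompt wording; the source does not give the exact text.
    return (
        f"I am driving in {location}. My plan is: {plan}. "
        f"Unexpected situation: {situation}. "
        "Reply with one or two search keywords (one or two words each), "
        "separated by commas."
    )


def build_plan_prompt(
    location: str, plan: str, situation: str, paragraphs: List[str]
) -> str:
    # Hypothetical prompt wording; only the filtered paragraphs are included.
    rules = "\n".join(paragraphs) if paragraphs else "(no matching rules found)"
    return (
        f"Location: {location}\nInitial plan: {plan}\n"
        f"Unexpected situation: {situation}\n"
        f"Relevant traffic rules:\n{rules}\n"
        "Give an updated plan that follows these rules."
    )


def adapt_plan(
    llm: Callable[[str], str],
    handbook: str,
    location: str,
    plan: str,
    situation: str,
) -> str:
    """Two LLM calls: one for keywords, one for the adapted plan."""
    reply = llm(build_keyword_prompt(location, plan, situation))
    # Keep at most two keywords, mirroring the caption of Figure 3.
    keywords = [k.strip() for k in reply.split(",") if k.strip()][:2]
    paragraphs = extract_key_paragraphs(handbook, keywords)
    return llm(build_plan_prompt(location, plan, situation, paragraphs))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so the filtering step can be exercised with a stub before a real model is wired in.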
