Instruct Large Language Models to Drive like Humans

Ruijun Zhang; Xianda Guo; Wenzhao Zheng; Chenming Zhang; Kurt Keutzer; Long Chen

TL;DR

The paper addresses how to align large language model (LLM) motion planners with human driving behavior. It introduces InstructDriver, which tunes an LLM on explicit driving instructions and uses an InstructChain module to produce an interpretable chain of reasoning behind each plan. The LLM is fine-tuned with LoRA and evaluated in closed loop on the nuPlan benchmark. The contributions are the InstructDriver framework, the InstructChain module, and open-loop and closed-loop evaluations that show competitive performance and improved interpretability. The work finds that more diverse training data improves planning quality, and it acknowledges that lighter models are needed for real-time deployment.

Abstract

Motion planning in complex scenarios is a core challenge in autonomous driving. Conventional methods apply predefined rules or learn from driving data to plan the future trajectory. Recent methods draw on the knowledge preserved in large language models (LLMs) and apply it to driving scenarios. Despite the promising results, it is still unclear whether the LLM learns the underlying human logic of driving. In this paper, we propose InstructDriver, a method that transforms an LLM into a motion planner through explicit instruction tuning to align its behavior with that of humans. We derive driving instruction data from human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only when the light is green). We then employ an interpretable InstructChain module to reason toward a final plan that reflects the instructions. InstructDriver allows both the injection of human rules and learning from driving data, enabling interpretability and data scalability. Unlike existing methods that were evaluated in open-loop or simulated settings, we adopt the real-world closed-loop motion planning benchmark nuPlan for better evaluation. InstructDriver demonstrates the effectiveness of the LLM planner in a real-world closed-loop setting. Our code is publicly available at https://github.com/bonbon-rj/InstructDriver.
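As a rough illustration of the idea described in the abstract (scenario data rendered as text, combined with explicit instructions, and a trajectory read back from the model's reply), the following minimal Python sketch may help. Everything in it is an assumption made for illustration: the `Agent` fields, the prompt layout, the `Trajectory:` marker, and the function names `build_prompt` and `parse_trajectory` are hypothetical and are not taken from the InstructDriver code. Only the two example instructions come from the abstract.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# The two example instructions mentioned in the abstract; the real
# instruction set used by the authors is not reproduced here.
INSTRUCTIONS = [
    "Do not cause collisions.",
    "Proceed only when the traffic light is green.",
]


@dataclass
class Agent:
    """Hypothetical description of a surrounding agent in the ego frame."""
    kind: str     # e.g. "vehicle" or "pedestrian"
    x: float      # longitudinal offset in metres
    y: float      # lateral offset in metres
    speed: float  # metres per second


def build_prompt(ego_speed: float, light: str, agents: List[Agent]) -> str:
    """Render instructions and scenario state as plain text (assumed layout)."""
    lines = ["Instructions:"]
    lines += [f"{i}. {text}" for i, text in enumerate(INSTRUCTIONS, start=1)]
    lines.append("Scenario:")
    lines.append(f"Ego speed: {ego_speed:.1f} m/s. Traffic light: {light}.")
    for a in agents:
        lines.append(
            f"- {a.kind} at ({a.x:.1f}, {a.y:.1f}) m moving at {a.speed:.1f} m/s"
        )
    lines.append("Explain how each instruction applies, then give 'Trajectory:' "
                 "followed by (x, y) waypoints.")
    return "\n".join(lines)


def parse_trajectory(reply: str) -> List[Tuple[float, float]]:
    """Extract (x, y) waypoints that follow the assumed 'Trajectory:' marker."""
    _, _, tail = reply.partition("Trajectory:")
    pairs = re.findall(
        r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)", tail
    )
    return [(float(x), float(y)) for x, y in pairs]
```

In this sketch the reasoning text that precedes the marker plays the role of an interpretable chain of instructions, while only the waypoints after the marker would be passed on to a simulator. A reply without the marker yields an empty trajectory, which a caller would need to handle.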

Paper Structure (15 sections, 11 equations, 4 figures, 3 tables)

Introduction
Related Work
Proposed Approach
Motion Planning as Language Modeling
Instruction-based Behavior Alignment
Chain of Instructions
InstructDriver
Experiment
Dataset
Evaluation Metrics
Implementation Details
Main Results
Ablation and Analysis
Limitations
Conclusion

Figures (4)

Figure 1: The motivation of our InstructDriver. The left figure compares different motion planning methods for autonomous driving, showcasing our method's ability to function without predefined objectives. It highlights how our method guides the planner to produce human-like driving behavior. The right figure illustrates the correspondence between the provided instructions and the resulting outputs.
Figure 2: Overview of the motion planning process of our method InstructDriver. Our approach transforms scenario data into textual descriptions and, by setting specific instructions, enables a fine-tuned LLM to generate InstructChain and trajectories that align with human driving behavior. The trajectory is subsequently applied in a simulated environment.
Figure 3: Comparison of simulation results with varying temporality and data volumes, with M denoting millions.
Figure 4: Illustration of the planning process. The specific scenarios and their corresponding InstructChain show that the planner can generate plans that align with human driving behavior based on the given instructions.
