Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving

Jianbiao Mei; Yukai Ma; Xuemeng Yang; Licheng Wen; Xinyu Cai; Xin Li; Daocheng Fu; Bo Zhang; Pinlong Cai; Min Dou; Botian Shi; Liang He; Yong Liu; Yu Qiao

TL;DR

LeapAD introduces a dual-process closed-loop autonomous driving framework that mimics human cognition by coupling a scene-focused Vision-Language Model (VLM) with a slow Analytic Process and a fast Heuristic Process. A transferable memory bank and a reflection mechanism enable continuous self-improvement in a CARLA-based environment, yielding data-efficient learning and strong performance gains over camera-only baselines. The Analytic Process leverages world knowledge through an LLM to accumulate high-quality driving experiences, which are distilled into the Heuristic Process via supervised fine-tuning and few-shot prompting to enable rapid edge-deployed decisions. Empirical results show that LeapAD surpasses several baselines with only $11{,}000$ fine-tuning examples for the VLM and a memory bank of up to $18{,}000$ samples, achieving a driving score of $DS=83.11$ in Town05 while demonstrating robust cross-town generalization and continuous improvement through reflection.

Abstract

Autonomous driving has advanced significantly due to improvements in sensors, machine learning, and artificial intelligence. However, prevailing methods struggle with intricate scenarios and causal relationships, hindering adaptability and interpretability in varied environments. To address these problems, we introduce LeapAD, a novel paradigm for autonomous driving inspired by the human cognitive process. Specifically, LeapAD emulates human attention by selecting critical objects relevant to driving decisions, which simplifies environmental interpretation and reduces decision-making complexity. Additionally, LeapAD incorporates an innovative dual-process decision-making module, which consists of an Analytic Process (System-II) for thorough analysis and reasoning, along with a Heuristic Process (System-I) for swift and empirical processing. The Analytic Process leverages its logical reasoning to accumulate linguistic driving experience, which is then transferred to the Heuristic Process by supervised fine-tuning. Through reflection mechanisms and a growing memory bank, LeapAD continuously improves itself from past mistakes in a closed-loop environment. Closed-loop testing in CARLA shows that LeapAD outperforms all methods relying solely on camera input while requiring 1-2 orders of magnitude less labeled data. Experiments also demonstrate that as the memory bank expands, the Heuristic Process with only 1.8B parameters can inherit the knowledge from a GPT-4 powered Analytic Process and achieve continuous performance improvement. Project page: https://pjlab-adg.github.io/LeapAD.
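
As a rough illustration of how a memory bank could supply few-shot examples to the Heuristic Process, the following minimal Python sketch retrieves the stored experiences most similar to the current scene description and assembles them into a prompt. It is not the authors' implementation: the names `Experience`, `MemoryBank`, and `build_few_shot_prompt` are hypothetical, and the bag-of-words cosine similarity is a stand-in assumption for whatever text embedding a real system would use.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Experience:
    scene: str     # linguistic scene description
    decision: str  # decision (with reasoning) recorded for that scene


def _bow(text: str) -> Counter:
    # Toy bag-of-words vector (assumption); a real system would use a learned embedding.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class MemoryBank:
    items: list = field(default_factory=list)

    def add(self, exp: Experience) -> None:
        self.items.append(exp)

    def retrieve(self, scene: str, k: int = 3) -> list:
        # Rank stored experiences by similarity to the query scene, keep the top k.
        query = _bow(scene)
        ranked = sorted(
            self.items,
            key=lambda e: _cosine(query, _bow(e.scene)),
            reverse=True,
        )
        return ranked[:k]


def build_few_shot_prompt(bank: MemoryBank, scene: str, k: int = 3) -> str:
    # Retrieved experiences become in-context examples; the query scene goes last.
    shots = bank.retrieve(scene, k)
    parts = [f"Scene: {e.scene}\nDecision: {e.decision}" for e in shots]
    parts.append(f"Scene: {scene}\nDecision:")
    return "\n\n".join(parts)
```

In this sketch, a growing memory bank changes behavior without retraining, since new experiences simply become candidates for retrieval. Supervised fine-tuning on the same samples, as described in the abstract, would be a separate step.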

Paper Structure (36 sections, 2 equations, 18 figures, 5 tables)

Introduction
Related Works
Large Vision Language Models
Empowering Autonomous Driving with Foundation Models
From Data-Driven to Knowledge-driven Autonomous Driving
Methodology
Overview
Scene Understanding with VLM
Analytic Process
Heuristic Process
Experiments
Data preparation
Data for VLM.
Data for Heuristic Process.
Implementation Details
...and 21 more sections

Figures (18)

Figure 1: The detailed architecture of our proposed LeapAD. The scene understanding module analyzes surrounding images and provides descriptions of critical objects that may influence driving decisions. These scenario descriptions are then fed into the dual-process decision module, which drives reasoning and decision-making. The generated decisions are then transmitted to action executors, where they are converted into control signals for interaction with the simulator. The Analytic Process then uses an LLM to accumulate experience in driving analysis and decision-making, conducting reflections on accidents. The experience is stored in the memory bank and transferred to a lightweight language model, forming our Heuristic Process for quick responses and continuous learning.
Figure 2: Detailed procedure of the reflection mechanism. When the Heuristic Process encounters a traffic accident, the Analytic Process intervenes, analyzing historical frames to pinpoint errors and provide corrected samples. These corrected samples are then integrated into the memory bank to facilitate continuous learning.
Figure 3: Illustration of the fine-tuning process. We fine-tune the VLM (Qwen-VL-7B) using 11K instruction-following samples for scene understanding (left). We also use the samples collected in the memory bank to fine-tune the Qwen-1.5 model used in the Heuristic Process (right).
Figure 4: Illustration of the ablation studies on few-shot prompting and memory size. See the appendix for the detailed data.
Figure 5: Effectiveness of the reflection mechanism. The $x$-axis represents the rounds of reflection, while the $y$-axis denotes the resulting driving score. The dashed line illustrates performances on different routes after multi-round reflection, and the red "average score" denotes the mean performance across all routes.
...and 13 more figures
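
The reflection procedure described in the Figure 2 caption can be sketched in Python as follows. This is only an illustrative outline under stated assumptions: `Frame`, `reflect`, and `toy_analyzer` are hypothetical names, the history window size is arbitrary, and the rule-based analyzer merely stands in for the LLM-powered Analytic Process.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Frame:
    scene: str     # scene description at this time step
    decision: str  # decision taken by the Heuristic Process


# Hypothetical analyzer interface: returns (index of the faulty frame,
# corrected decision), or None when no error can be identified.
Analyzer = Callable[[List[Frame]], Optional[Tuple[int, str]]]


def reflect(
    history: List[Frame],
    analyze: Analyzer,
    memory: List[Frame],
    window: int = 5,  # number of frames before the accident to inspect (assumption)
) -> bool:
    """After an accident, find the faulty frame and store a corrected sample."""
    recent = history[-window:]
    result = analyze(recent)
    if result is None:
        return False
    idx, corrected = result
    # The corrected sample pairs the original scene with the fixed decision.
    memory.append(Frame(scene=recent[idx].scene, decision=corrected))
    return True


def toy_analyzer(frames: List[Frame]) -> Optional[Tuple[int, str]]:
    # Stand-in for the LLM: flags the first frame that accelerates at a red light.
    for i, frame in enumerate(frames):
        if "red light" in frame.scene and frame.decision == "ACCELERATE":
            return i, "STOP"
    return None
```

Calling `reflect(history, toy_analyzer, memory)` after a simulated collision appends at most one corrected sample to `memory`, which later retrieval or fine-tuning can then draw on.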
