TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning

Chengkai Xu, Jiaqi Liu, Shiyu Fang, Yiming Cui, Dong Chen, Peng Hang, Jian Sun

TL;DR

TeLL-Drive addresses the data-efficiency and real-time decision challenges of autonomous driving by coupling a risk-aware Teacher LLM with a self-attention DRL Student. The framework uses memory, reflective evaluation, and chain-of-thought prompts to guide high-level decision making, while a KL-constrained, attention-based fusion enables safe, efficient learning and robust execution. Empirical results across diverse driving scenarios show superior performance and stability compared with traditional DRL and prior LLM-based methods, with ablations confirming the importance of teacher guidance and the attention fusion. Vehicle-in-the-loop experiments further validate real-time feasibility, robustness, and reliability in near-real-world settings, underscoring the practical potential of hybrid LLM-DRL approaches for autonomous driving.

Abstract

Although Deep Reinforcement Learning (DRL) and Large Language Models (LLMs) each show promise in addressing decision-making challenges in autonomous driving, DRL often suffers from high sample complexity, while LLMs struggle to make decisions in real time. To address these limitations, we propose TeLL-Drive, a hybrid framework in which a Teacher LLM guides an attention-based Student DRL policy. By incorporating risk metrics, historical scenario retrieval, and domain heuristics into context-rich prompts, the LLM produces high-level driving strategies through chain-of-thought reasoning. A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness across diverse driving conditions. Experiments across multiple traffic scenarios show that TeLL-Drive outperforms existing baseline methods, including other LLM-based approaches, in terms of success rate, average return, and real-time feasibility. Ablation studies underscore the importance of each model component, especially the synergy between the attention mechanism and LLM-driven guidance. Finally, we build a virtual-real fusion experimental platform and use vehicle-in-the-loop experiments to verify the real-time performance, robustness, and reliability of the algorithm running on real vehicles.
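
The teacher-student coupling described above can be illustrated with a small sketch. This is not the paper's loss function, which is not given in this summary; it only shows one common way a KL term can pull a student policy toward a teacher's suggestion. The discrete action space, the label-smoothed teacher distribution, the weight `beta`, the direction of the KL term, and all function names are assumptions made for illustration, and the self-attention fusion is not modeled here.

```python
import math


def softmax(logits):
    """Convert raw student logits into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_distribution(suggested_action, n_actions, smoothing=0.1):
    """Hypothetical helper: smooth the teacher's single suggested action
    into a distribution so that the KL term stays finite."""
    off = smoothing / (n_actions - 1)
    return [1.0 - smoothing if a == suggested_action else off
            for a in range(n_actions)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def guided_loss(student_logits, action, advantage, suggested_action, beta=0.5):
    """Policy-gradient surrogate plus an assumed KL penalty toward the teacher.

    beta trades off teacher guidance against the student's own exploration;
    beta = 0 recovers the plain policy-gradient term.
    """
    probs = softmax(student_logits)
    pg_loss = -math.log(probs[action]) * advantage
    teacher = teacher_distribution(suggested_action, len(probs))
    return pg_loss + beta * kl_divergence(teacher, probs)


if __name__ == "__main__":
    # Three hypothetical actions: 0 = slow down, 1 = keep speed, 2 = accelerate.
    agrees = guided_loss([0.0, 5.0, 0.0], action=1, advantage=1.0, suggested_action=1)
    disagrees = guided_loss([5.0, 0.0, 0.0], action=0, advantage=1.0, suggested_action=1)
    print(f"student agrees with teacher:    {agrees:.3f}")
    print(f"student disagrees with teacher: {disagrees:.3f}")
```

In this sketch a student whose probabilities already match the teacher's suggestion pays almost no penalty, while a student that disagrees is penalized in proportion to `beta`. Decaying `beta` over training would be one way to shift from guidance to self-exploration, though whether TeLL-Drive does so is not stated here.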

Paper Structure

This paper contains 29 sections, 20 equations, 10 figures, 2 tables, 1 algorithm.

Figures (10)

  • Figure 1: The LLM teacher guides the DRL agent in decision-making within complex traffic scenarios, offering corrective feedback during exploration to enhance learning efficiency and decision-making accuracy.
  • Figure 2: The overall conceptual framework of TeLL-Drive, where a DRL student agent is guided by the LLM teacher for better decision making in autonomous driving.
  • Figure 3: Proposed policy network with self-attention layer. The network integrates self-attention to estimate action probabilities and value functions from the teacher’s strategy, enabling strategy distillation and a balance between teacher guidance and self-exploration.
  • Figure 4: The designed gradient verification scenarios for simulation: (a) Unsignalized Intersection; (b) High-Speed Ramp Merging; (c) Four-Lane Adaptive Cruise.
  • Figure 5: Comparison of the performance of this model with traditional DRL training results.
  • ...and 5 more figures