LA-RL: Language Action-guided Reinforcement Learning with Safety Guarantees for Autonomous Highway Driving

Yiming Shu; Jiahui Xu; Jiwei Tang; Ruiyang Gao; Chen Sun

TL;DR

LA-RL addresses the safety-efficiency trade-off in autonomous highway driving by pairing a language-guided actor-critic policy with a safety-critical planner that combines model predictive control (MPC) and discrete control barrier functions (DCBFs). Task-specific reward shaping balances efficiency against safety, and a slack mechanism in the planner keeps the optimization feasible so the policy can explore without leaving the safe action set. In the reported experiments, LA-RL achieves a success rate approximately 20% higher than a knowledge graph (KG) based baseline and about 30% higher than a retrieval augmented generation (RAG) based baseline, and it reaches a 100% success rate in low-density scenarios. The work shows a practical route to safe, interpretable, and proactive highway driving through language-guided decision making and formal safety constraints.

Abstract

Autonomous highway driving demands a careful balance between proactive, efficiency-seeking behavior and robust safety guarantees. This paper proposes Language Action-guided Reinforcement Learning (LA-RL) with Safety Guarantees, a novel framework that integrates the semantic reasoning of large language models (LLMs) into an actor-critic architecture with an improved safety layer. Within this framework, task-specific reward shaping harmonizes the dual objectives of maximizing driving efficiency and ensuring safety, guiding decision-making based on both environmental insights and clearly defined goals. To enhance safety, LA-RL incorporates a safety-critical planner that combines model predictive control (MPC) with discrete control barrier functions (DCBFs). This layer formally constrains the LLM-informed policy to a safe action set. It also employs a slack mechanism that improves solution feasibility, prevents overly conservative behavior, and allows for greater policy exploration without compromising safety. Extensive experiments demonstrate that LA-RL significantly outperforms several current state-of-the-art (SOTA) methods, offering a more adaptive, reliable, and robust solution for autonomous highway driving. Compared to existing SOTA methods, it achieves an approximately 20% higher success rate than the knowledge graph (KG) based baseline and an about 30% higher success rate than the retrieval augmented generation (RAG) based baseline. In low-density environments, LA-RL achieves a 100% success rate. These results confirm its enhanced exploration of the state-action space and its ability to autonomously adopt more efficient, proactive strategies in complex, mixed-traffic highway environments.
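
To make the idea of a slack-relaxed DCBF safety layer concrete, the following is a minimal sketch and not the paper's planner. It assumes one common form of the DCBF condition, h(x_{k+1}) >= (1 - gamma) * h(x_k), relaxed by a non-negative slack that is penalized in the cost. It also replaces the multi-step MPC optimization with a single-step grid search over accelerations. The car-following model, the time-headway barrier, the function names, and every parameter value are hypothetical choices for illustration.

```python
# Sketch only: one-step DCBF safety filter with a penalized slack variable.
# The car-following model, the barrier function, and all parameter values are
# assumptions chosen for illustration; they are not taken from the LA-RL paper.

DT = 0.1         # time step [s]
HEADWAY = 1.5    # desired time headway [s]
MIN_GAP = 5.0    # standstill gap [m]
GAMMA = 0.2      # DCBF decay rate, 0 < GAMMA <= 1
PENALTY = 1e4    # weight on the squared slack
A_MIN, A_MAX, A_STEP = -6.0, 3.0, 0.1  # acceleration grid [m/s^2]


def barrier(gap, v):
    """h(x) = gap - MIN_GAP - HEADWAY * v; the state is safe when h(x) >= 0."""
    return gap - MIN_GAP - HEADWAY * v


def safe_accel(a_rl, gap, v, v_lead):
    """Return (acceleration, slack) closest to the proposed action a_rl.

    The relaxed DCBF condition is h_next >= (1 - GAMMA) * h_now - slack,
    with slack >= 0. The lead vehicle is assumed to keep a constant speed.
    """
    h_now = barrier(gap, v)
    n = int(round((A_MAX - A_MIN) / A_STEP))
    best = None
    for i in range(n + 1):
        a = A_MIN + i * A_STEP
        # Forward-Euler prediction of the next state under acceleration a.
        gap_next = gap + (v_lead - v) * DT
        v_next = v + a * DT
        h_next = barrier(gap_next, v_next)
        # Smallest slack that makes the relaxed condition hold for this a.
        slack = max(0.0, (1.0 - GAMMA) * h_now - h_next)
        cost = (a - a_rl) ** 2 + PENALTY * slack ** 2
        if best is None or cost < best[0]:
            best = (cost, a, slack)
    return best[1], best[2]


# Free road: the proposed action already satisfies the condition.
a_free, s_free = safe_accel(a_rl=2.0, gap=100.0, v=25.0, v_lead=25.0)
# Closing in on a slower lead vehicle: the action is pulled toward braking.
a_close, s_close = safe_accel(a_rl=2.0, gap=45.0, v=25.0, v_lead=20.0)
# Already too close: no grid action meets the hard condition, so slack > 0.
a_tight, s_tight = safe_accel(a_rl=2.0, gap=20.0, v=25.0, v_lead=20.0)
```

In the third call the unrelaxed condition has no solution within the acceleration bounds, so a hard-constrained filter would fail; the slack keeps the problem solvable and the filter falls back to the strongest available braking. A full MPC-DCBF planner would enforce the condition over a prediction horizon and use a proper optimizer instead of a grid.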
