HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model
Mustafa Yildirim, Barkin Dagda, Saber Fallah
TL;DR
HighwayLLM addresses explainability and safety in highway autonomous driving by integrating a pre-trained RL planner with an LLM-based trajectory planner that retrieves similar past trajectories via FAISS and uses function-calling to output safe next states. The RL component leverages a Deep Q-Network to maximize expected return, while the LLM generates the next three waypoints subject to safety constraints, tracked by a PID controller and guided by IDM-based longitudinal dynamics; key formulas include $R(\pi,r)=\mathbb{E}_{\pi}[\sum_{t=0}^{\infty} \gamma^{t} r(s_t,a_t)]$ and $u(t)= k_p e(t) + k_i \int_0^t e(\tau) d\tau + k_d \frac{d}{dt} e(t)$. Experiments on the highD dataset show that using the LLM for trajectory planning or as a safety layer reduces collisions and yields higher average speeds compared to a baseline RL model, at the cost of increased inference time. Overall, the work demonstrates a viable, explainable, multi-modal planning framework that combines RL, LLM reasoning, and traditional control for highway driving, while highlighting areas for real-time feasibility and mitigation of LLM hallucinations.
Abstract
Autonomous driving is a complex task which requires advanced decision making and control algorithms. Understanding the rationale behind the autonomous vehicles' decision is crucial to ensure their safe and effective operation on highway driving. This study presents a novel approach, HighwayLLM, which harnesses the reasoning capabilities of large language models (LLMs) to predict the future waypoints for ego-vehicle's navigation. Our approach also utilizes a pre-trained Reinforcement Learning (RL) model to serve as a high-level planner, making decisions on appropriate meta-level actions. The HighwayLLM combines the output from the RL model and the current state information to make safe, collision-free, and explainable predictions for the next states, thereby constructing a trajectory for the ego-vehicle. Subsequently, a PID-based controller guides the vehicle to the waypoints predicted by the LLM agent. This integration of LLM with RL and PID enhances the decision-making process and provides interpretability for highway autonomous driving.
