Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models
Nicolas Baumann, Cheng Hu, Paviththiren Sivasothilingam, Haotong Qin, Lei Xie, Michele Magno, Luca Benini
TL;DR
This work tackles edge-case robustness in autonomous driving by fusing knowledge-driven reasoning from locally deployed Large Language Models (LLMs) with a safety-guaranteed Model Predictive Controller (MPC). The proposed DecisionxLLM module reasons over robotic data using natural language prompts, and the MPCxLLM module dynamically tunes the MPC's cost and constraint parameters. Both run on-board, using Retrieval Augmented Generation (RAG), Low Rank Adaptation (LoRA) fine-tuning, and quantization to meet real-time constraints. Reported results include reasoning accuracy improvements of up to 10.45%, controller adaptability gains of up to 52.2%, and up to a 10.5x increase in on-board inference throughput on resource-constrained hardware, demonstrated on a 1:10 scale robotic platform. The approach enables a practical, privacy-preserving, edge-enabled Autonomous Driving System (ADS) that bridges high-level decision-making with low-level control adaptation, with open-source code for broader adoption.
Abstract
Neural Networks (NNs) trained through supervised learning struggle with the edge-case scenarios common in real-world driving, because exhaustive datasets covering all edge cases are intractable. Knowledge-driven approaches, akin to how humans intuitively detect unexpected driving behavior, are therefore a suitable complement to data-driven methods. This work proposes a hybrid architecture combining a low-level Model Predictive Controller (MPC) with locally deployed Large Language Models (LLMs) to enhance decision-making and Human Machine Interaction (HMI). The DecisionxLLM module evaluates robotic state information against natural language instructions to ensure adherence to desired driving behavior. The MPCxLLM module then adjusts MPC parameters based on LLM-generated insights, achieving control adaptability while preserving the safety and constraint guarantees of traditional MPC systems. Further, to enable efficient on-board deployment and eliminate dependency on cloud connectivity, we shift processing to the on-board computing platform using an approach that exploits Retrieval Augmented Generation (RAG), Low Rank Adaptation (LoRA) fine-tuning, and quantization. Experimental results demonstrate that these enhancements improve reasoning accuracy by up to 10.45% and control adaptability by as much as 52.2%, and increase computational efficiency (tokens/s) by up to 10.5x, validating the proposed framework's practicality for real-time deployment even on down-scaled robotic platforms. This work bridges high-level decision-making with low-level control adaptability, offering a synergistic framework for knowledge-driven and adaptive Autonomous Driving Systems (ADS).
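
To make the MPCxLLM idea concrete, the following is a minimal sketch of how an LLM-proposed parameter update could be validated before it reaches the controller. It is not the authors' implementation: the parameter names (`v_max`, `track_safety_margin`, `q_lateral`), their ranges, the JSON reply format, and the function `apply_llm_update` are all hypothetical choices made for illustration. The sketch only shows one plausible way to keep LLM output inside predefined limits, so that the MPC's own constraint handling remains the safety layer.

```python
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamBounds:
    low: float
    high: float


# Hypothetical tunable MPC parameters with illustrative safe ranges.
# These names and numbers are assumptions, not values from the paper.
SAFE_BOUNDS = {
    "v_max": ParamBounds(0.5, 6.0),                # speed limit [m/s]
    "track_safety_margin": ParamBounds(0.1, 1.0),  # boundary margin [m]
    "q_lateral": ParamBounds(0.0, 100.0),          # lateral error weight
}


def apply_llm_update(current: dict, llm_reply: str) -> dict:
    """Merge an LLM-proposed JSON update into the current MPC parameters.

    Unparseable replies leave the parameters unchanged. Unknown keys and
    non-finite or non-numeric values are ignored. Accepted values are
    clamped to their predefined bounds.
    """
    try:
        proposed = json.loads(llm_reply)
    except json.JSONDecodeError:
        return dict(current)
    if not isinstance(proposed, dict):
        return dict(current)

    updated = dict(current)
    for name, value in proposed.items():
        bounds = SAFE_BOUNDS.get(name)
        if bounds is None:
            continue  # parameter is not on the tunable allow-list
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue  # reject strings, lists, booleans, null
        if not math.isfinite(value):
            continue  # reject NaN and infinities
        updated[name] = min(max(float(value), bounds.low), bounds.high)
    return updated


if __name__ == "__main__":
    params = {"v_max": 3.0, "track_safety_margin": 0.3, "q_lateral": 10.0}
    reply = '{"v_max": 12.0, "q_lateral": 25, "unknown": 1}'
    print(apply_llm_update(params, reply))
```

In this sketch the LLM never writes to the controller directly: it can only suggest values for an allow-listed set of parameters, and out-of-range suggestions are clamped rather than applied verbatim.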
