Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving

Mehdi Azarafza; Mojtaba Nayyeri; Charles Steinmetz; Steffen Staab; Achim Rettberg

TL;DR

The paper addresses the challenge of generalizing LLM-based reasoning to real-time autonomous driving. It introduces a hybrid reasoning pipeline that feeds YOLOv8 object detections, together with sensor inputs, into a GPT-4-style LLM to produce brake and throttle commands in CARLA. The evaluation covers nine scenarios under three weather conditions and compares three reasoning modes (common-sense, arithmetic, and hybrid). Hybrid reasoning performs best, averaging over $65\%$ accuracy and producing precise brake and throttle values at run time. The work shows that LLMs can be structured to reason about dynamic driving contexts and generate actionable, scenario-specific control signals, which could augment autopilot systems where traditional methods struggle in low-visibility or complex environments. It also highlights practical considerations, namely latency and the need for lightweight, domain-specific LLMs for real-time deployment, and outlines future work on optimizing inputs and improving run-time efficiency. The study contributes a concrete framework for combining mathematical and common-sense reasoning in autonomous driving, with benefits quantified in a high-fidelity simulator and implications for real-world decision-making under adverse conditions.

Abstract

Large Language Models (LLMs) have garnered significant attention for their ability to understand text and images, generate human-like text, and perform complex reasoning tasks. However, their ability to generalize this advanced reasoning, combined with natural language text, to decision-making in dynamic situations requires further exploration. In this study, we investigate how well LLMs can adapt and apply a combination of arithmetic and common-sense reasoning, particularly in autonomous driving scenarios. We hypothesize that LLMs' hybrid reasoning abilities can improve autonomous driving by enabling them to analyze detected objects and sensor data, understand driving regulations and physical laws, and offer additional context. This addresses complex scenarios, such as decisions in low visibility (due to weather conditions), where traditional methods might fall short. We evaluated LLMs on accuracy by comparing their answers with human-generated ground truth inside CARLA. The results showed that when a combination of images (detected objects) and sensor data is fed into the LLM, it can offer precise information for brake and throttle control in autonomous vehicles across various weather conditions. This formulation and the resulting answers can assist in decision-making for auto-pilot systems.
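
To make the idea of hybrid reasoning concrete, the following minimal Python sketch shows one way detected objects, sensor readings, and an arithmetic hint could be combined into a single LLM prompt, and how answers could be scored against human ground truth. It is an illustration only and not the authors' implementation: the `Detection` type, the function names, the prompt wording, the constant-deceleration stopping-distance formula, and the default deceleration of 6.0 m/s² are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one detected object."""
    label: str         # object class, e.g. "person"
    distance_m: float  # distance from the ego vehicle in metres


def stopping_distance_m(speed_mps: float, decel_mps2: float) -> float:
    """Constant-deceleration stopping distance: v^2 / (2a)."""
    return speed_mps ** 2 / (2.0 * decel_mps2)


def build_hybrid_prompt(detections, speed_mps, weather, decel_mps2=6.0):
    """Combine detected objects, sensor readings, and an arithmetic hint.

    The deceleration default and the prompt wording are assumptions.
    """
    objects = "; ".join(
        f"{d.label} at {d.distance_m:.1f} m" for d in detections
    ) or "none"
    stop = stopping_distance_m(speed_mps, decel_mps2)
    return (
        f"Weather: {weather}. Speed: {speed_mps:.1f} m/s. "
        f"Detected objects: {objects}. "
        f"Estimated stopping distance: {stop:.1f} m. "
        "Reply with brake and throttle values between 0 and 1."
    )


def accuracy(answers, ground_truth):
    """Fraction of answers that match the human-labelled ground truth."""
    correct = sum(a == g for a, g in zip(answers, ground_truth))
    return correct / len(ground_truth)


prompt = build_hybrid_prompt(
    [Detection("person", 12.0)], speed_mps=10.0, weather="rain"
)
print(prompt)
```

The prompt string would then be sent to whichever LLM is under test; that call is deliberately left out because its interface depends on the provider.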

Paper Structure

This paper contains 15 sections, 16 equations, 5 figures, 1 table.

Introduction
Related work
Generalization of Large Language Models in Autonomous Car Driving
Reinforcement Learning for Autonomous Car Driving
Background
CARLA
Object detection
Large language model
LLM-based reasoning
Assumption
Common-sense reasoning
Arithmetic reasoning
Combination of common-sense and arithmetic reasoning
Evaluation
Conclusion

Figures (5)

Figure 1: Workflow of hybrid reasoning of LLM in CARLA
Figure 2: Detected objects in two different kinds of weather in common-sense reasoning
Figure 3: Detected 'Person' in front of the car
Figure 4: Comparing accuracy in different weather conditions
Figure 5: Comparison of the number of correct and incorrect answers using three methods: (a) common-sense, (b) arithmetic, (c) hybrid
