Optimizing Autonomous Driving for Safety: A Human-Centric Approach with LLM-Enhanced RLHF

Yuan Sun, Navid Salami Pargoo, Peter J. Jin, Jorge Ortiz

TL;DR

This research combines RLHF and LLMs to enhance autonomous driving safety, implementing multiple human-controlled agents, such as cars and pedestrians, to simulate real-life road environments.

Abstract

Reinforcement Learning from Human Feedback (RLHF) is popular in large language models (LLMs), whereas traditional Reinforcement Learning (RL) often falls short. Current autonomous driving methods typically utilize either human feedback in machine learning, including RL, or LLMs. Most feedback guides the car agent's learning process (e.g., controlling the car). RLHF is usually applied in the fine-tuning step, requiring direct human "preferences," which are not commonly used in optimizing autonomous driving models. In this research, we combine RLHF and LLMs to enhance autonomous driving safety. Because training a model with human guidance from scratch is inefficient, our framework starts with a pre-trained autonomous car agent model and implements multiple human-controlled agents, such as cars and pedestrians, to simulate real-life road environments. The autonomous car model is not directly controlled by humans. We integrate both physical and physiological feedback to fine-tune the model, optimizing this process using LLMs. This multi-agent interactive environment ensures safe, realistic interactions before real-world application. Finally, we will validate our model using data gathered from real-life testbeds located in New Jersey and New York City.

Paper Structure (11 sections, 1 equation, 6 figures)

Figures (6)

  • Figure 1: Overview of our human-centric multi-agent LLM-enhanced RLHF system framework. During the fine-tuning of an autonomous car model, human agents and proliferated LLM agents mimicking multiple human behaviors are incorporated into the environment to align with real-world human preferences.
  • Figure 2: Simulation room with VR headset, steering controls, and monitors for real-time multimodal data collection and autonomous driving optimization.
  • Figure 3: Example of GPT-4o imitating a human agent in the CARLA simulation. The LLM agent is attempting to overtake the car in front in a human-like manner.
  • Figure 4: Example of the LLM agent guiding the autonomous car agent to reverse away from a collision with a building.
  • Figure 5: Example of the LLM agent assisting the human agent in using the simulation.
  • ...and 1 more figure