From Words to Wheels: Automated Style-Customized Policy Generation for Autonomous Driving

Xu Han, Xianda Chen, Zhenghan Cai, Pinlong Cai, Meixin Zhu, Xiaowen Chu

TL;DR

The paper addresses the problem of aligning autonomous driving policies with natural language user commands so that driving style can be customized. It introduces Words2Wheels, which uses a Style-Customized Reward Function and a Driving Style Database to generate Style Policies via reinforcement learning (RL). Large language models (LLMs) with Retrieval-Augmented Generation guide the reward design, and a Statistical Evaluation module measures how well the resulting style aligns with the command. The approach generates policies without relying on extensive human driving data and generalizes to new commands through zero-shot style adaptation. Experiments on car-following tasks show improved accuracy, generalization, and adaptability compared to baselines, indicating practical potential for customized AV behavior.

Abstract

Autonomous driving technology has witnessed rapid advancements, with foundation models improving interactivity and user experiences. However, current autonomous vehicles (AVs) face significant limitations in delivering command-based driving styles. Most existing methods either rely on predefined driving styles that require expert input or use data-driven techniques like Inverse Reinforcement Learning to extract styles from driving data. These approaches, though effective in some cases, face challenges: difficulty obtaining specific driving data for style matching (e.g., in Robotaxis), inability to align driving style metrics with user preferences, and limitations to pre-existing styles, restricting customization and generalization to new commands. This paper introduces Words2Wheels, a framework that automatically generates customized driving policies based on natural language user commands. Words2Wheels employs a Style-Customized Reward Function to generate a Style-Customized Driving Policy without relying on prior driving data. By leveraging large language models and a Driving Style Database, the framework efficiently retrieves, adapts, and generalizes driving styles. A Statistical Evaluation module ensures alignment with user preferences. Experimental results demonstrate that Words2Wheels outperforms existing methods in accuracy, generalization, and adaptability, offering a novel solution for customized AV driving behavior. Code and demo available at https://yokhon.github.io/Words2Wheels/.

Paper Structure

This paper contains 16 sections, 1 equation, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Words2Wheels automatically generates customized driving policies in alignment with user commands by LLM-powered reward design.
  • Figure 2: (A) Workflow of Words2Wheels: When a natural language command is received, the system matches it with a style from the database. Style Reward generation and policy training run simultaneously in the backend, resulting in a new Style Policy that may outperform the existing one and replace it. (B) Driving Style Database: This repository stores Style Rewards (initially from both data-driven and human-designed methods), Style Policies, and their statistics. It manages the increasing variety of driving styles and supports the automated policy customization. (C) Statistical Evaluation Module: This module ensures that the generated driving styles closely align with user commands by evaluating them against natural driving behaviors.
  • Figure 3: Example of automated Style Policy generation.
  • Figure 4: Simplified example of Statistical Evaluation module.
  • Figure 5: Comparison of customized policies over style-aware metrics.
  • ...and 2 more figures
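
The retrieve-then-replace workflow described in the Figure 2 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `StyleEntry` and `StyleDatabase` names, the word-overlap similarity (a simple stand-in for LLM-based retrieval), the dictionary of reward weights (a stand-in for a Style Reward), and the scalar `alignment_score` (a stand-in for the output of the Statistical Evaluation module).

```python
from dataclasses import dataclass, field


@dataclass
class StyleEntry:
    """One record in a hypothetical Driving Style Database."""
    description: str        # natural-language label for the style
    reward_weights: dict    # assumed stand-in for a Style Reward
    alignment_score: float  # assumed stand-in for a Statistical Evaluation result


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (placeholder for LLM retrieval)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


@dataclass
class StyleDatabase:
    entries: list = field(default_factory=list)

    def retrieve(self, command: str) -> StyleEntry:
        """Return the stored style whose description best matches the command."""
        return max(self.entries, key=lambda e: token_overlap(command, e.description))

    def maybe_replace(self, old: StyleEntry, new: StyleEntry) -> StyleEntry:
        """Swap in the new entry only if its alignment score is strictly higher."""
        if new.alignment_score > old.alignment_score:
            self.entries[self.entries.index(old)] = new
            return new
        return old


# Two illustrative car-following styles (values are made up).
db = StyleDatabase([
    StyleEntry("aggressive close following", {"speed": 1.0, "gap": 0.2}, 0.6),
    StyleEntry("cautious large gap", {"speed": 0.3, "gap": 1.0}, 0.7),
])

# Step 1: match the incoming command to an existing style.
matched = db.retrieve("drive with a large gap please")

# Step 2: pretend the backend produced a newly trained candidate for that style.
candidate = StyleEntry(matched.description, {"speed": 0.25, "gap": 1.2}, 0.8)

# Step 3: keep whichever entry aligns better with the command.
current = db.maybe_replace(matched, candidate)
```

In this sketch the matched style can be served immediately while a candidate is produced separately, which mirrors the caption's description of reward generation and policy training running in the backend. How the paper actually scores alignment and decides on replacement is not specified in this summary.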