StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving

Ruiyang Hao, Bowen Jing, Haibao Yu, Zaiqing Nie

TL;DR

StyleDrive addresses the lack of personalization in end-to-end autonomous driving by introducing a large real-world dataset and a standardized benchmark tailored for driving-style conditioning. The authors combine map topology, dynamic semantics inferred by a fine-tuned vision-language model, rule-and-distribution-based heuristics, and human-in-the-loop verification to annotate both objective driving behaviors and subjective driving style preferences. They then establish the StyleDrive Benchmark with a Style-Modulated PDMS metric (SM-PDMS) to evaluate how closely a policy aligns with target driving styles while maintaining safety, tested across multiple state-of-the-art models. Results show that incorporating driving preferences substantially improves behavioral alignment with human demonstrations, highlighting the value of style-conditioned E2EAD for trust, safety, and real-world adoption.

Abstract

Personalization, while extensively studied in conventional autonomous driving pipelines, has been largely overlooked in the context of end-to-end autonomous driving (E2EAD), despite its critical role in fostering user trust, safety perception, and real-world adoption. A primary bottleneck is the absence of large-scale real-world datasets that systematically capture driving preferences, severely limiting the development and evaluation of personalized E2EAD models. In this work, we introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD, integrating comprehensive scene topology with rich dynamic context derived from agent dynamics and semantics inferred via a fine-tuned vision-language model (VLM). We propose a hybrid annotation pipeline that combines behavioral analysis, rule-and-distribution-based heuristics, and subjective semantic modeling guided by VLM reasoning, with final refinement through human-in-the-loop verification. Building upon this dataset, we introduce the first standardized benchmark for systematically evaluating personalized E2EAD models. Empirical evaluations on state-of-the-art architectures demonstrate that incorporating personalized driving preferences significantly improves behavioral alignment with human demonstrations.

Paper Structure

This paper contains 92 sections, 1 equation, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Motivation and Overview of StyleDrive.
  • Figure 2: Framework for Modeling and Annotation of Driving Preference.
  • Figure 3: Dataset Statistics and Distribution Analysis.
  • Figure 4: Visualization of driving style distribution in three typical scenarios. Each case is drawn from similar local scenes without pedestrians or leading cars, ensuring style differences arise primarily from drivers' own behavioral preferences. Red trajectories denote aggressive driving and blue ones denote conservative driving. More demos are provided in Sect. 2 of the Supplementary.
  • Figure 5: Qualitative illustration of DiffusionDrive-Style predictions under different style conditions across identical scenarios. Left: Aggressive (A) vs. Normal (N); Right: Conservative (C) vs. Normal (N). Red lines indicate the model's predicted trajectory under the given style condition; green lines denote the ground-truth human trajectory. Clear behavioral differences emerge with style variation, reflecting the model’s ability to adapt its outputs to driving preferences.
  • ...and 2 more figures