Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving

Zehao Wang, Huaide Jiang, Shuaiwu Dong, Yuping Wang, Hang Qiu, Jiachen Li

Abstract

Human driving behavior is inherently personal, shaped by long-term habits and influenced by short-term intentions. Individuals differ in how they accelerate, brake, merge, yield, and overtake across diverse situations. However, existing end-to-end autonomous driving systems either optimize for generic objectives or rely on fixed driving modes, lacking the ability to adapt to individual preferences or interpret natural language intent. To address this gap, we propose Drive My Way (DMW), a personalized Vision-Language-Action (VLA) driving framework that aligns with users' long-term driving habits and adapts to real-time user instructions. DMW learns a user embedding from our personalized driving dataset collected across multiple real drivers and conditions the policy on this embedding during planning, while natural language instructions provide additional short-term guidance. Closed-loop evaluation on the Bench2Drive benchmark demonstrates that DMW improves style instruction adaptation, and user studies show that its generated behaviors are recognizable as each driver's own style, highlighting personalization as a key capability for human-centered autonomous driving. Our data and code are available at https://dmw-cvpr.github.io/.

Paper Structure

This paper contains 21 sections, 2 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Drive My Way (DMW) achieves end-to-end personalized driving via both long-term preference alignment and short-term style instruction adaptation.
  • Figure 2: An overview of the Personal Driving Dataset, which consists of the driving data and structured driver profile data.
  • Figure 3: An overview of the DMW framework with a pretrained VLA backbone. The model takes front-view camera images, instructions, route target points, and a user profile as inputs; the motion predictor outputs route and speed waypoints, from which the base action (throttle, steering angle) is derived. The residual decoder outputs a discrete residual that is applied to the base to produce the final personalized action.
  • Figure 4: The contrastive learning mechanism on the long-term preference encoder and route processor.
  • Figure 5: The fine-tuning process and reward generation for short-term instruction alignment.
  • ...and 2 more figures