DriveGPT: Scaling Autoregressive Behavior Models for Driving
Xin Huang, Eric M. Wolff, Paul Vernaza, Tung Phan-Minh, Hongge Chen, David S. Hayden, Mark Edmonds, Brian Pierce, Xinxin Chen, Pratik Elias Jacob, Xiaobai Chen, Chingiz Tairbekov, Pratik Agarwal, Tianshi Gao, Yuning Chai, Siddhartha Srinivasa
TL;DR
DriveGPT investigates scaling laws for autoregressive behavior models in autonomous driving by systematically varying data size, model capacity, and compute. It uses a transformer encoder to fuse scene context and an LLM-style autoregressive decoder to generate future agent trajectories as sequences of Verlet actions. The model is trained on a large-scale driving dataset and evaluated in planning, prediction, and closed-loop settings. The key findings are that data is the primary bottleneck, that model and compute scaling yield diminishing returns beyond a certain point, and that the autoregressive decoder delivers robust planning performance and, after pretraining, competitive motion prediction. The work demonstrates real-time viability for closed-loop driving and offers practical guidance on scaling strategies for safer, more robust autonomous driving systems.
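To make the "Verlet actions" idea concrete, the sketch below integrates a sequence of actions into positions. It assumes the common Verlet-style convention in which each action is a 2-D residual added to the constant-velocity extrapolation of the last two positions. The function name `verlet_rollout` and the exact action definition are illustrative assumptions, not taken from the DriveGPT implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def verlet_rollout(p_prev: Point, p_curr: Point, actions: List[Point]) -> List[Point]:
    """Integrate Verlet-style actions into a list of future positions.

    Assumption (not confirmed by the source): each action (ax, ay) is a
    residual added to the constant-velocity extrapolation
    p_curr + (p_curr - p_prev).
    """
    trajectory: List[Point] = []
    for ax, ay in actions:
        # Constant-velocity extrapolation plus the action residual.
        nxt = (2 * p_curr[0] - p_prev[0] + ax,
               2 * p_curr[1] - p_prev[1] + ay)
        trajectory.append(nxt)
        # Slide the two-step history window forward.
        p_prev, p_curr = p_curr, nxt
    return trajectory
```

Under this convention a zero action means "keep the current velocity", so `verlet_rollout((0.0, 0.0), (1.0, 0.0), [(0.0, 0.0), (0.0, 0.0)])` continues the straight-line motion, and a nonzero action changes the velocity for all later steps.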
Abstract
We present DriveGPT, a scalable behavior model for autonomous driving. We model driving as a sequential decision-making task and train a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up our model parameters and training data by multiple orders of magnitude, which lets us explore scaling properties in terms of dataset size, model parameters, and compute. We evaluate DriveGPT across different scales in a planning task, through both quantitative metrics and qualitative examples, including closed-loop driving in complex real-world scenarios. In a separate prediction task, DriveGPT outperforms state-of-the-art baselines and improves further when pretrained on a large-scale dataset, which further validates the benefits of data scaling.
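As a minimal sketch of what predicting future agent states "as tokens in an autoregressive fashion" means, the loop below feeds each predicted token back in as input for the next step. Everything here is hypothetical: `toy_logits` is a stand-in for a trained decoder, the vocabulary size is arbitrary, and greedy selection is used only to keep the example deterministic (a real system could sample from the distribution instead).

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prefix: List[int],
    steps: int,
) -> List[int]:
    """Generate `steps` tokens, conditioning each step on all previous tokens."""
    tokens = list(prefix)
    for _ in range(steps):
        logits = next_token_logits(tokens)
        # Greedy choice: index of the highest-scoring token.
        best = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(best)
    # Return only the newly generated tokens.
    return tokens[len(prefix):]


def toy_logits(tokens: Sequence[int], vocab_size: int = 4) -> List[float]:
    """Hypothetical stand-in model: favors (last token + 1) mod vocab_size."""
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]
```

For example, `greedy_decode(toy_logits, [0], 5)` walks the toy vocabulary cyclically. In a driving model, each token would instead index a discretized agent state or action.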
