Benchmarking Autonomous Vehicles: A Driver Foundation Model Framework
Yuxin Zhang, Cheng Wang, Hubert P. H. Shum
TL;DR
The paper addresses the gap between AV performance and human driving in safety, comfort, efficiency, and energy economy by proposing a Driver Foundation Model (DFM) built on a large-scale drone-derived dataset. It introduces a multi-modal, transformer-based architecture that outputs How, What, Where, When and Why information, enabling comprehensive benchmarking and verification of AVs against human baselines. Key contributions include a 7.5 million-trajectory drone dataset across diverse urban ODDs, a DFM architecture with language, motion, attributes, and environment encoders, and a suite of benchmarks for safety, comfort, efficiency, and energy economy with probabilistic and explainable outputs. This framework offers a scalable, human-centric path to specification, validation, and deployment of AVs, aligning technical performance with social acceptance and practical operating conditions.
Abstract
Autonomous vehicles (AVs) are poised to revolutionize global transportation systems. However, their widespread acceptance and market penetration remain significantly below expectations. This gap is primarily driven by persistent challenges in safety, comfort, commuting efficiency and energy economy when compared to the performance of experienced human drivers. We hypothesize that these challenges can be addressed through the development of a driver foundation model (DFM). Accordingly, we propose a framework for establishing DFMs to comprehensively benchmark AVs. Specifically, we describe a large-scale dataset collection strategy for training a DFM, discuss the core functionalities such a model should possess, and explore potential technical solutions to realize these functionalities. We further present the utility of the DFM across the operational spectrum, from defining human-centric safety envelopes to establishing benchmarks for energy economy. Overall, we aim to formalize the DFM concept and introduce a new paradigm for the systematic specification, verification and validation of AVs.
