Benchmarking Autonomous Vehicles: A Driver Foundation Model Framework

Yuxin Zhang, Cheng Wang, Hubert P. H. Shum

TL;DR

The paper addresses the gap between AV performance and human driving in safety, comfort, efficiency, and energy economy by proposing a Driver Foundation Model (DFM) built on a large-scale drone-derived dataset. It introduces a multi-modal, transformer-based architecture that outputs How, What, Where, When and Why information, enabling comprehensive benchmarking and verification of AVs against human baselines. Key contributions include a 7.5 million-trajectory drone dataset across diverse urban ODDs, a DFM architecture with language, motion, attributes, and environment encoders, and a suite of benchmarks for safety, comfort, efficiency, and energy economy with probabilistic and explainable outputs. This framework offers a scalable, human-centric path to specification, validation, and deployment of AVs, aligning technical performance with social acceptance and practical operating conditions.

Abstract

Autonomous vehicles (AVs) are poised to revolutionize global transportation systems. However, their widespread acceptance and market penetration remain significantly below expectations. This gap is primarily driven by persistent challenges in safety, comfort, commuting efficiency and energy economy when compared to the performance of experienced human drivers. We hypothesize that these challenges can be addressed through the development of a driver foundation model (DFM). Accordingly, we propose a framework for establishing DFMs to comprehensively benchmark AVs. Specifically, we describe a large-scale dataset collection strategy for training a DFM, discuss the core functionalities such a model should possess, and explore potential technical solutions to realize these functionalities. We further present the utility of the DFM across the operational spectrum, from defining human-centric safety envelopes to establishing benchmarks for energy economy. Overall, we aim to formalize the DFM concept and introduce a new paradigm for the systematic specification, verification and validation of AVs.

Paper Structure (6 sections, 3 figures)

This paper contains 6 sections, 3 figures.

Figures (3)

  • Figure 1: An example illustrating the information (e.g., object class, velocity, position) extracted by the drone-based data collection strategy.
  • Figure 2: Our exemplary scenario coverage for an urban ODD. (a) residential apartment area; (b) urban arterial road; (c) intersections; (d) on-ramp; (e) expressway; (f) off-ramp; (g) roundabouts; (h) accident/construction zone; (i) icy and snowy road; (j) parking lot.
  • Figure 3: The proposed DFM framework. Its two main components, a multi-modal encoder and a multi-task decoder, answer various user questions (see the three typical use cases (UCs)) and provide the multiple functionalities needed for benchmarking AVs.