100 Drivers, 2200 km: A Natural Dataset of Driving Style toward Human-centered Intelligent Driving Systems
Chaopeng Zhang, Wenshuo Wang, Zhaokun Chen, Junqiang Xi
TL;DR
The paper addresses the lack of standardized driving-style benchmarks by introducing the 100-DrivingStyle dataset, a naturalistic driving corpus in which 100 drivers were recorded on a fixed route covering urban and highway scenarios, about 2200 km in total. It pairs CAN-based operation data with driving-style tags from the drivers themselves and from an expert, collected via a five-point Likert questionnaire, so that style-analysis methods can be benchmarked against explicit labels. The authors apply factor analysis and six classifiers (including a support vector machine, SVM, and a naive Bayes classifier, NBC) to quantify subjective-objective consistency, demonstrating the dataset's utility for evaluating driving-style recognition and informing human-centered driving-system design. The dataset is intended as a resource for driver modeling, behavior prediction, and driving-style studies in real-world conditions.
Abstract
Effective driving style analysis is critical to developing human-centered intelligent driving systems that consider drivers' preferences. However, the approaches and conclusions of most related studies are diverse and inconsistent because no unified dataset tagged with driving styles exists as a reliable benchmark. The absence of explicit driving style labels makes it difficult to verify different approaches and algorithms. This paper provides a new benchmark by constructing a natural dataset of Driving Style (100-DrivingStyle) tagged with subjective evaluations of 100 drivers' driving styles. In this dataset, each driver's driving style is rated both by the driver and by an expert using a Likert-scale questionnaire. The testing routes are selected to cover various driving scenarios, including highways, urban roads, highway ramps, and signalized traffic. The collected driving data consist of lateral and longitudinal manipulation information, including steering angle, steering speed, lateral acceleration, throttle position, throttle rate, brake pressure, etc. This dataset is the first to provide detailed manipulation data with driving-style tags, and we demonstrate its use as a benchmark with six classifiers. The 100-DrivingStyle dataset is available via https://github.com/chaopengzhang/100-DrivingStyle-Dataset
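
Because each driver is rated twice (self and expert), one natural first check on such labels is how often the two ratings agree beyond chance. The following is a minimal sketch of that idea using unweighted Cohen's kappa. It is not the authors' analysis: it assumes, for illustration only, that each rating reduces to a single integer score from 1 to 5 per driver, and the function name `cohens_kappa` and all scores shown are hypothetical, not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa between two equally long lists of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of drivers given the same score by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, from each rater's marginal score distribution.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical score throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Synthetic scores for eight hypothetical drivers (assumption, not real data).
self_scores = [1, 2, 3, 3, 4, 5, 2, 4]
expert_scores = [1, 2, 3, 4, 4, 5, 3, 4]
kappa = cohens_kappa(self_scores, expert_scores)
print(f"kappa = {kappa:.3f}")
```

Unweighted kappa treats a one-point disagreement the same as a four-point one; for ordinal Likert scores a weighted variant may be more appropriate, depending on how the questionnaire is actually scored.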
