OnSiteVRU: A High-Resolution Trajectory Dataset for High-Density Vulnerable Road Users
Zhangcun Yan, Jianqing Li, Peng Hang, Jian Sun
TL;DR
The paper tackles the challenge of VRU safety in dense mixed-traffic environments by introducing OnSiteVRU, a high-resolution trajectory dataset collected across intersections, road segments, and urban villages in China. Leveraging multi-source sensing (aerial and onboard) and HD Lanelet2 maps, the dataset comprises approximately 17,429 trajectories at 0.04 s temporal granularity, with complete signal-state information and diverse VRU types such as pedestrians and non-motorized vehicles. Key contributions include extensive scenario coverage, high VRU density, rigorous data processing to produce precise world-coordinate trajectories, and a publicly available benchmark with an online/offline competition format for trajectory prediction. The dataset supports traffic flow modeling, VRU behavior analysis, and autonomous driving virtual testing, making it a resource for improving VRU safety in complex urban environments.
Abstract
With accelerating urbanization and growing transportation demand, the safety of vulnerable road users (VRUs, such as pedestrians and cyclists) in mixed traffic flows has become an increasingly prominent concern, and high-precision, diverse trajectory data are needed to support the development and optimization of autonomous driving systems. However, existing datasets fall short in capturing the diversity and dynamics of VRU behaviors, making it difficult to meet the research demands of complex traffic environments. To address this gap, this study developed the OnSiteVRU datasets, which cover a variety of scenarios, including intersections, road segments, and urban villages. These datasets provide trajectory data for motor vehicles, electric bicycles, and human-powered bicycles, totaling approximately 17,429 trajectories at a temporal resolution of 0.04 seconds. The datasets integrate aerial-view naturalistic driving data and onboard real-time dynamic detection data, along with environmental information such as traffic signals, obstacles, and real-time maps, enabling a comprehensive reconstruction of interaction events. The results demonstrate that VRU_Data outperforms traditional datasets in terms of VRU density and scene coverage, offering a more comprehensive representation of VRU behavioral characteristics. This provides critical support for traffic flow modeling, trajectory prediction, and autonomous driving virtual testing. The dataset is publicly available for download at: https://www.kaggle.com/datasets/zcyan2/mixed-traffic-trajectory-dataset-in-from-shanghai.
