VRUD: A Drone Dataset for Complex Vehicle-VRU Interactions within Mixed Traffic

Ziyu Wang, Hongrui Kou, Cheng Wang, Ruochen Li, Hubert P. H. Shum, Amir Atapour-Abarghouei, Yuxin Zhang

Abstract

The Operational Design Domain (ODD) of urban-oriented Level 4 (L4) autonomous driving, especially for autonomous robotaxis, faces formidable challenges in complex urban mixed-traffic environments. These challenges stem mainly from the high density of Vulnerable Road Users (VRUs) and their highly uncertain, unpredictable interaction behaviors. However, existing open-source datasets predominantly focus on structured scenarios such as highways or regulated intersections, leaving a critical gap in data representing chaotic, unstructured urban environments. To address this gap, this paper proposes an efficient, high-precision method for constructing drone-based datasets and establishes the Vehicle-Vulnerable Road User Interaction Dataset (VRUD), as illustrated in Figure 1. Distinct from prior works, VRUD is collected from typical "Urban Villages" in Shenzhen, characterized by loose traffic supervision and extreme occlusion. The dataset comprises 4 hours of 4K/30 Hz recordings, containing 11,479 VRU trajectories and 1,939 vehicle trajectories. A key characteristic of VRUD is its composition: VRUs account for about 87% of all traffic participants, significantly exceeding the proportions in existing benchmarks. Furthermore, unlike datasets that only provide raw trajectories, we extracted 4,002 multi-agent interaction scenarios based on a novel Vector Time to Collision (VTTC) threshold, supported by standard OpenDRIVE HD maps. This study provides valuable, rare edge-case resources for enhancing the safety performance of Autonomous Driving Systems (ADS) in complex, unstructured urban environments. To facilitate further research, we have made the VRUD dataset open-source at: https://zzi4.github.io/VRUD/.
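The abstract mentions that interaction scenarios are extracted with a Vector Time to Collision (VTTC) threshold. The paper's exact VTTC formulation is not reproduced here; as a hypothetical sketch under the usual vector-based TTC definition, the time to collision between two agents can be estimated from their relative position and velocity, returning infinity when the agents are not closing on each other:

```python
import math

def vector_ttc(pos_a, vel_a, pos_b, vel_b):
    """Estimate time-to-collision between two agents from 2D state.

    Hypothetical illustration of a vector-based TTC; the VTTC metric
    defined in the paper may differ in detail. Positions in metres,
    velocities in m/s; returns seconds, or math.inf if not closing.
    """
    # Relative position (A -> B) and relative velocity of B w.r.t. A
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    # Closing rate: positive when the gap between the agents shrinks
    closing = -(rx * vx + ry * vy)
    if closing <= 0:
        return math.inf
    # |r|^2 / closing == |r| / (component of relative speed along r)
    return (rx * rx + ry * ry) / closing

# A vehicle driving at 1 m/s toward a stationary pedestrian 10 m ahead
print(vector_ttc((0, 0), (1, 0), (10, 0), (0, 0)))  # → 10.0
```

A scenario extractor would then keep any vehicle-VRU pair whose TTC falls below a chosen threshold for some window of frames.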


Paper Structure

This paper contains 17 sections, 13 figures, and 1 table.

Figures (13)

  • Figure 1: Example of a recorded sequence showing bounding boxes and labels for detected traffic participants. The bounding box color indicates the category of each traffic participant, and historical trajectories are rendered in the corresponding color. Each traffic participant is assigned a unique identifier.
  • Figure 2: Two Data Collection Sites for VRUD. (a) depicts an irregular intersection whose approaches all carry two-way single-lane traffic. The four areas surrounding the intersection correspond to two snack streets and two residential apartment complexes. (b) shows a two-way single-lane road lined with residential apartments on both sides. At both sites, apartment entrances and bus stops are situated along the road edges, and the parking spaces on either side of the roads are occupied by long-term parked vehicles.
  • Figure 3: Each subplot visualizes all annotated trajectories appearing in the scene: (a) cars, buses, and trucks; (b) pedestrians and cyclists; and (c) motorcycles and tricycles. The comparison reveals that the trajectory patterns of VRUs are significantly more disordered and scattered.
  • Figure 4: (a) shows the result of overlaying the first and last frames of a video captured in a single acquisition. (b) presents the same overlay after single-video stabilization. The deviation of the background positions between the two frames has been effectively corrected, ensuring that the image coordinates of all targets within a single video remain consistent with the actual global coordinate system.
  • Figure 5: This figure illustrates the overlay effect of two images after multi-video alignment, which respectively correspond to the fields of view from two separate acquisition sessions. It can be clearly observed from the figure that there are deviations in the acquisition altitude and position between the two sessions.
  • ...and 8 more figures