SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception

Manideep Reddy Aliminati, Bharatesh Chakravarthi, Aayush Atul Verma, Arpitsinh Vaghela, Hua Wei, Xuesong Zhou, Yezhou Yang

TL;DR

SEVD addresses the scarcity of synthetic, multi-view event-based driving data by using the CARLA simulator to produce synchronized $\langle x, y, p, t \rangle$ event streams from six ego DVS cameras and four fixed DVS sensors, along with RGB, depth, optical flow, semantic segmentation, and instance segmentation data. The dataset provides extensive annotations (2D/3D bounding boxes in COCO, Pascal VOC, and KITTI formats) across diverse lighting, weather, and scene types, totaling $27\,\text{h}$ of fixed and $31\,\text{h}$ of ego event data (plus the other sensor modalities) and over $9\text{M}$ bounding boxes. Baselines with state-of-the-art event-based detectors (RVT, RED) and a frame-based detector (YOLOv8) establish 2D detection benchmarks and indicate synthetic-to-real generalization potential, including transfer to real Prophesee data. SEVD thus supports the evaluation and development of multi-view, high-temporal-resolution perception for autonomous driving and V2I applications, including research on occlusion handling, domain shifts, and cooperative perception.
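To make the $\langle x, y, p, t \rangle$ event representation concrete, the following minimal sketch accumulates a stream of events into a signed per-pixel count over a time window, a common way to turn events into a frame-like input. It is an illustration only: the `Event` alias, the `accumulate_events` helper, the 0/1 polarity encoding, and the integer timestamps are assumptions made for this example, not SEVD's documented file format or the preprocessing used by the baselines.

```python
from typing import Iterable, List, Tuple

# One event as (x, y, p, t).
# Assumption for this sketch: p is 0 (OFF) or 1 (ON), t is an integer timestamp.
Event = Tuple[int, int, int, int]


def accumulate_events(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start: int,
    t_end: int,
) -> List[List[int]]:
    """Sum signed polarities per pixel for events with t_start <= t < t_end."""
    frame = [[0] * width for _ in range(height)]
    for x, y, p, t in events:
        in_window = t_start <= t < t_end
        in_bounds = 0 <= x < width and 0 <= y < height
        if in_window and in_bounds:
            # ON events add one, OFF events subtract one.
            frame[y][x] += 1 if p else -1
    return frame


if __name__ == "__main__":
    # Toy events, not taken from the dataset.
    toy_events = [(0, 0, 1, 10), (0, 0, 1, 20), (2, 1, 0, 30), (1, 1, 1, 999)]
    frame = accumulate_events(toy_events, width=3, height=2, t_start=0, t_end=100)
    print(frame)  # [[2, 0, 0], [0, 0, -1]]
```

The last toy event falls outside the time window and is ignored, which shows how the window length trades temporal resolution against event density per frame.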

Abstract

Recently, event-based vision sensors have gained attention for autonomous driving applications, as conventional RGB cameras face limitations in handling challenging dynamic conditions. However, the availability of real-world and synthetic event-based vision datasets remains limited. In response to this gap, we present SEVD, a first-of-its-kind multi-view ego and fixed perception synthetic event-based dataset using multiple dynamic vision sensors within the CARLA simulator. Data sequences are recorded across diverse lighting (noon, nighttime, twilight) and weather conditions (clear, cloudy, wet, rainy, foggy) with domain shifts (discrete and continuous). SEVD spans urban, suburban, rural, and highway scenes featuring various classes of objects (car, truck, van, bicycle, motorcycle, and pedestrian). Alongside event data, SEVD includes RGB imagery, depth maps, optical flow, and semantic and instance segmentation, facilitating a comprehensive understanding of the scene. Furthermore, we evaluate the dataset using state-of-the-art event-based (RED, RVT) and frame-based (YOLOv8) methods for traffic participant detection tasks and provide baseline benchmarks for assessment. Additionally, we conduct experiments to assess the synthetic event-based dataset's generalization capabilities. The dataset is available at https://eventbasedvision.github.io/SEVD

Paper Structure (13 sections, 1 equation, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Bird's Eye View of Ego and Fixed Perception Scenario: (a) The six views (Front-Left, Front, Front-Right, Rear-Left, Rear, Rear-Right) from the ego vehicle (circled in red), shown as event-based frames with their corresponding RGB frames. (b) Four views of an intersection from fixed cameras (C1, C2, C3, C4), with event-based and RGB frames for each view.
  • Figure 2: Navigating the Dynamic Landscape of Road Traffic: A glimpse of ego perception data offering six views through event-based vision supported by RGB, depth, optical-flow, semantic, and instance segmentation sensor data generated using CARLA.
  • Figure 3: (a) Visual Representation (event and RGB) of SEVD Dataset Features: The scene diversity (top row) considered during data generation from the urban to the rural, weather variability (middle row) captured in the dataset, ranging from clear skies to foggy scenarios, and the dynamic conditions (bottom row) showcasing sequences with continuously shifting parameters, mirroring real-world driving scenarios. (b) Dataset Distribution: shows the distribution of instances for each class in both ego and fixed perception scenarios.
  • Figure 4: Qualitative Results: Showcasing event-based and frame-based detection of different classes of objects in ego (left column) and fixed (right column) perception scenarios.
  • Figure 5: Real-World Fixed Perception Event-Data Acquisition: Data captured at an intersection using the high-resolution Prophesee EVK4 HD event camera (left) similar to a setting used in CARLA (right).
  • ...and 1 more figure