TUMTraf Event: Calibration and Fusion Resulting in a Dataset for Roadside Event-Based and RGB Cameras

Christian Creß, Walter Zimmer, Nils Purschke, Bach Ngoc Doan, Sven Kirchner, Venkatnarayanan Lakshminarasimhan, Leah Strand, Alois C. Knoll

TL;DR

This work extends a targetless calibration approach with clustering methods so that it can handle multiple moving objects, and develops an Early Fusion, a Simple Late Fusion, and a novel Spatiotemporal Late Fusion method for combining event-based and RGB camera data.

Abstract

Event-based cameras are predestined for Intelligent Transportation Systems (ITS). They provide very high temporal resolution and dynamic range, which can eliminate motion blur and improve detection performance at night. However, event-based images lack color and texture compared to images from a conventional RGB camera. Considering that, data fusion between event-based and conventional cameras can combine the strengths of both modalities. For this purpose, extrinsic calibration is necessary. To the best of our knowledge, no targetless calibration between event-based and RGB cameras can handle multiple moving objects, nor does a data fusion method optimized for the domain of roadside ITS exist. Furthermore, synchronized event-based and RGB camera datasets recorded from a roadside perspective have not yet been published. To fill these research gaps, based on our previous work, we extended our targetless calibration approach with clustering methods to handle multiple moving objects. Furthermore, we developed an early fusion, simple late fusion, and a novel spatiotemporal late fusion method. Lastly, we published the TUMTraf Event Dataset, which contains more than 4,111 synchronized event-based and RGB images with 50,496 labeled 2D boxes. During our extensive experiments, we verified the effectiveness of our calibration method with multiple moving objects. Furthermore, compared to a single RGB camera, we increased the detection performance by up to +9 % mAP in the day and by up to +13 % mAP during the challenging night with our presented event-based sensor fusion methods. The TUMTraf Event Dataset is available at https://innovation-mobility.com/tumtraf-dataset.
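
To make the idea of a late fusion on detections more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the event-based detections have already been projected into the RGB image using the extrinsic calibration, and that boxes are associated by a greedy intersection-over-union (IoU) match. The matching rule, the function names `iou` and `simple_late_fusion`, and the threshold of 0.5 are all assumptions made for illustration. The three output labels mirror the distinction drawn in the Figure 1 caption (detected by both cameras, by the RGB camera only, or by the event-based camera only).

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def simple_late_fusion(
    rgb_boxes: List[Box],
    event_boxes: List[Box],
    iou_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> List[Tuple[Box, str]]:
    """Greedy one-to-one association of RGB and event-based detections.

    Returns (box, source) pairs with source in
    {"both", "rgb_only", "event_only"}.
    """
    fused: List[Tuple[Box, str]] = []
    unmatched_event = list(range(len(event_boxes)))
    for rgb_box in rgb_boxes:
        best_j, best_iou = None, iou_threshold
        for j in unmatched_event:
            score = iou(rgb_box, event_boxes[j])
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j is None:
            fused.append((rgb_box, "rgb_only"))
        else:
            unmatched_event.remove(best_j)
            fused.append((rgb_box, "both"))
    # Event detections without an RGB partner are kept as event-only.
    fused.extend((event_boxes[j], "event_only") for j in unmatched_event)
    return fused


rgb = [(0, 0, 10, 10), (20, 20, 30, 30)]
event = [(1, 1, 11, 11), (50, 50, 60, 60)]
result = simple_late_fusion(rgb, event)
```

In this toy example, the first RGB box overlaps an event-based box and is labeled "both", the second RGB box has no partner, and the remaining event-based box is kept as "event_only". A spatiotemporal variant would additionally consult per-object tracking information before making this decision, which the sketch deliberately leaves out.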

Paper Structure (15 sections, 13 equations, 10 figures, 6 tables)

Figures (10)

  • Figure 1: This figure shows the sensor fusion between event-based and RGB cameras and its impact during a sunny day and a night in sleet. In the event-based and RGB camera sections, the blue bounding boxes represent detections without fusion. In the sensor fusion section, a green bounding box indicates that an object was detected by both the event-based and the RGB camera. A blue bounding box shows detection exclusively by the RGB camera, and a red bounding box shows detection exclusively by the event-based camera (not available here). A unique track ID is assigned when objects are detected in several frames.
  • Figure 2: The main components of our targetless extrinsic calibration algorithm are "pre-processing conventional camera," "pre-processing event-based camera," and "clustering-based targetless extrinsic calibration." The event-based camera inherently indicates moving image regions, whereas we identify such regions in the RGB camera by analyzing the last three images. We extended our previous work (Creß et al., 2023) with DBSCAN (Ester et al., 1996) and can now handle multiple moving objects. The fundamental goal is to find associations per cluster pair to calculate a global transformation matrix. This approach allows us to calibrate in more complex traffic scenarios.
  • Figure 3: We recorded the TUMTraf Event Dataset at this intersection in Garching near Munich. Besides event-based and RGB cameras, the gantry contains numerous other sensors, e.g., Lidars, which are the basis for the TUMTraf Dataset family.
  • Figure 4: Street lights cause significant noise at night, which must be eliminated with an anti-flickering filter. This phenomenon is due to the lamps operating with 50 Hz alternating current or pulse width modulation.
  • Figure 5: This figure illustrates the processing pipeline of Early Fusion, Simple Late Fusion, and Spatiotemporal Late Fusion. Early Fusion uses the raw images from the RGB and event-based cameras. On the other hand, the late fusion methods operate on the detections based on the individual images and the motion mask of the RGB camera. In addition, Spatiotemporal Late Fusion utilizes tracking information of each object for its fusion decision.
  • ...and 5 more figures
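
The Figure 2 caption mentions DBSCAN as the clustering step that separates multiple moving objects. As a rough illustration of what that step does, here is a compact, dependency-free reimplementation of textbook DBSCAN applied to 2D pixel coordinates of moving regions. It is a sketch under assumptions rather than the paper's code: the Euclidean distance, the parameter names `eps` and `min_pts`, and the toy values used below are chosen purely for illustration, and the actual pre-processing and parameters of the authors are not given in this excerpt.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinate of a moving pixel

NOISE = -1


def region_query(points: List[Point], i: int, eps: float) -> List[int]:
    """Indices of all points within distance eps of points[i] (including i)."""
    xi, yi = points[i]
    eps2 = eps * eps
    return [
        j
        for j, (xj, yj) in enumerate(points)
        if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps2
    ]


def dbscan(points: List[Point], eps: float, min_pts: int) -> List[int]:
    """Textbook DBSCAN: returns one cluster label per point, -1 for noise."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return [NOISE if label is None else label for label in labels]


# Two compact groups of moving pixels plus one isolated noise pixel.
pixels = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(pixels, eps=1.5, min_pts=3)
```

Each resulting cluster would correspond to one moving object in one camera. Finding associations between cluster pairs across the two cameras and estimating the global transformation matrix are separate steps that this sketch does not attempt.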