Evaluating the Impact of Synthetic Data on Object Detection Tasks in Autonomous Driving

Enes Özeren, Arka Bhowmick

TL;DR

The paper addresses data scarcity in autonomous driving perception by evaluating how useful synthetic data is for 2D and 3D object detection. Using real datasets (KITTI, BDD100K, A2D2) and the BIT-TS synthetic dataset across camera and LiDAR modalities, it trains models on real, synthetic, and mixed data and measures generalization with mAP@50 (mean average precision at an IoU threshold of 0.5). Mixing synthetic with real data generally improves generalization for 2D detection, with the mix++ configurations performing best. 3D detection shows more nuanced behavior because of distribution shifts in LiDAR data, although certain synthetic-real mixes (e.g., BIT-TS with A2D2) can improve performance on KITTI. Overall, synthetic data is a valuable complement to real data for robust autonomous driving perception, and BIT-TS offers versatile scenario coverage. Further work should scale up the datasets and expand the object categories and modalities.
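To make the IoU 0.5 criterion behind mAP@50 concrete, the following is a minimal sketch of how a single predicted 2D box could be matched against a ground-truth box. It is not the paper's evaluation code: the `(x_min, y_min, x_max, y_max)` box layout, the function names `iou` and `is_true_positive`, and the inclusive `>=` comparison are all assumptions made for illustration (some benchmarks use a strict `>` instead).

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 2D boxes."""
    # Corners of the overlapping rectangle (may be empty)
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: does the prediction match at the given IoU threshold?"""
    # Inclusive comparison is an assumption of this sketch
    return iou(pred, gt) >= threshold


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 4.0, 4.0)
    prediction = (0.0, 0.0, 4.0, 2.0)
    print(iou(prediction, ground_truth))
    print(is_true_positive(prediction, ground_truth))
```

A full mAP@50 computation would additionally rank predictions by confidence, build a precision-recall curve per class, and average the resulting per-class average precision values. This sketch covers only the box-matching step.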

Abstract

The increasing application of autonomous driving systems necessitates large-scale, high-quality datasets to ensure robust performance across diverse scenarios. Synthetic data has emerged as a viable solution to augment real-world datasets due to its cost-effectiveness, availability of precise ground-truth labels, and the ability to model specific edge cases. However, synthetic data may introduce distributional differences and biases that could impact model performance in real-world settings. To evaluate the utility and limitations of synthetic data, we conducted controlled experiments using multiple real-world datasets and a synthetic dataset generated by BIT Technology Solutions GmbH. Our study spans two sensor modalities, camera and LiDAR, and investigates both 2D and 3D object detection tasks. We compare models trained on real, synthetic, and mixed datasets, analyzing their robustness and generalization capabilities. Our findings demonstrate that the use of a combination of real and synthetic data improves the robustness and generalization of object detection models, underscoring the potential of synthetic data in advancing autonomous driving technologies.

Paper Structure

This paper contains 11 sections, 1 equation, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Image examples from three datasets: BIT-TS synthetic data (top left), BDD100K (top right), and KITTI (bottom).
  • Figure 2: Point cloud examples from three datasets: BIT-TS synthetic data (top left), A2D2 (top right), and KITTI (bottom).
  • Figure 3: Test-set mAP@50 values for the YOLOv7 2D object detection models.
  • Figure 4: Test-set mAP@50 values for the SECOND 3D object detection models.