Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios

Richard Marcus, Christian Vogel, Inga Jatzkowski, Niklas Knoop, Marc Stamminger

TL;DR

This work tackles the synthetic-to-real transfer problem for 3D LiDAR object detection in driving scenarios by building a CARLA-based data-generation pipeline with domain randomization and explicit sensor modeling. It demonstrates that carefully modeled LiDAR intensity, variations in environment and vehicle behavior, and bounding-box adjustments allow a detector trained on synthetic data to generalize to KITTI, and that fine-tuning on a small real-data subset nearly closes the remaining gap. The authors provide a modular, open pipeline and multiple sensor variants to study what drives the domain gap, showing that sensor realism and randomization have a substantial impact on transfer performance. The approach is useful for synthetic pretraining followed by rapid adaptation to real-world data, and points toward scalable, domain-robust perception in autonomous driving.

Abstract

An important factor in advancing autonomous driving systems is simulation. Yet, there has been rather little progress on transferability between the virtual and the real world. We revisit this problem for 3D object detection on LiDAR point clouds and propose a dataset generation pipeline based on the CARLA simulator. Utilizing domain randomization strategies and careful modeling, we are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset. Furthermore, we compare different virtual sensor variants to gain insight into which sensor attributes may be responsible for the prevalent domain gap. Finally, fine-tuning with a small portion of real data almost matches the baseline, and fine-tuning with the full training set slightly surpasses it.
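
To make the ideas of sensor-level domain randomization and bounding-box adjustment more concrete, the following is a minimal sketch, not the authors' implementation. The function names (`randomize_point_cloud`, `shrink_box`), the specific perturbations (random point dropout, Gaussian range noise) and all parameter values are assumptions chosen for illustration only; the paper's actual randomization and box-shrinking rules are not specified in this summary.

```python
import numpy as np


def randomize_point_cloud(points, rng, drop_prob=0.1, range_noise_std=0.02):
    """Hypothetical sensor randomization for an (N, 4) array of x, y, z, intensity.

    Drops points at random and perturbs the remaining ones along their
    ray from the sensor origin. Parameter values are illustrative only.
    """
    points = np.asarray(points, dtype=np.float64)
    # Randomly drop points to imitate missing LiDAR returns.
    keep = rng.random(len(points)) >= drop_prob
    kept = points[keep].copy()
    # Add Gaussian noise to the measured range, keeping the ray direction.
    ranges = np.linalg.norm(kept[:, :3], axis=1)
    safe = np.where(ranges > 0, ranges, 1.0)  # avoid division by zero
    noisy = ranges + rng.normal(0.0, range_noise_std, size=len(kept))
    kept[:, :3] *= (noisy / safe)[:, None]
    return kept


def shrink_box(box, factor=0.9):
    """Hypothetical box adjustment: scale length, width and height of a
    (x, y, z, l, w, h, yaw) box around its center; pose stays unchanged."""
    x, y, z, l, w, h, yaw = box
    return (x, y, z, l * factor, w * factor, h * factor, yaw)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.uniform(-40.0, 40.0, size=(2000, 4))
    augmented = randomize_point_cloud(cloud, rng)
    print(cloud.shape, augmented.shape)
    print(shrink_box((10.0, 2.0, -1.0, 4.2, 1.8, 1.6, 0.3)))
```

In a pipeline of the kind described, steps like these would sit in the processing stage between the simulator output and the detector's training data, so that different sensor variants can be compared without re-running the simulation.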

Paper Structure

This paper contains 29 sections, 2 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Typical samples from our synthetic dataset. We mimic the KITTI configuration and use multiple maps for data generation.
  • Figure 2: Modular Sim2Real Pipeline: The simulation generates an intermediate output. A processing script enhances the realism of the data and prepares it for object detection.
  • Figure 3: Bird's-eye-view screenshots of the nine CARLA maps from which the synthetic dataset was created. Notably, Town10 (bottom left) and Town15 (top right) increase the proportion of urban environments.
  • Figure 4: Original bounding boxes in purple, shrunk ones in green.
  • Figure 5: Bounding box distribution plots: the top row shows the side view, the bottom row the BEV; left is synthetic data before sensor randomization, middle is synthetic data after it, and right is KITTI data.
  • ...and 3 more figures