Data-driven Camera and Lidar Simulation Models for Autonomous Driving: A Review from Generative Models to Volume Renderers

Hamed Haghighi, Xiaomeng Wang, Hao Jing, Mehrdad Dianati

TL;DR

The paper surveys data-driven camera and Lidar sensor simulation methods for autonomous driving, contrasting generative models (GANs, diffusion, auto-regressive) with volume-rendering approaches (NeRFs, 3D Gaussian Splatting). It covers GAN-based, diffusion-based, and miscellaneous methods, details their input-output configurations, and compares them on 3D scene representation, realism, speed, stability, and diversity. It also outlines qualitative and quantitative evaluation strategies and identifies gaps such as the lack of standard benchmarks, limited real-time applicability, and untrustworthy generalisation, suggesting directions like hybrid physics-data models and enhanced volume rendering for dynamic scenes. The practical impact lies in guiding virtual testing workflows and synthetic data generation for robust ADS perception systems. Overall, the review clarifies the trade-offs between realism, speed, and controllability across data-driven sensor simulators and highlights where future work should focus to enable reliable, scalable ADS testing.

Abstract

Perception sensors, particularly camera and Lidar, are key elements of Autonomous Driving Systems (ADS) that enable them to comprehend their surroundings and make informed driving and control decisions. Therefore, developing realistic simulation models for these sensors is essential for conducting effective simulation-based testing of ADS. Moreover, the rise of deep learning-based perception models has increased the utility of sensor simulation models for synthesising diverse training datasets. Traditional sensor simulation models rely on computationally expensive physics-based algorithms, especially in complex systems such as ADS. Hence, the current potential resides in data-driven approaches, fuelled by the exceptional performance of deep generative models in capturing high-dimensional data distributions and of volume renderers in accurately representing scenes. This paper reviews the current state-of-the-art data-driven camera and Lidar simulation models and their evaluation methods. It explores a spectrum of models from the novel perspective of generative models and volume renderers. Generative models are discussed in terms of their input-output types, while volume renderers are categorised based on their input encoding. Finally, the paper illustrates commonly used evaluation techniques for assessing sensor simulation models and highlights the existing research gaps in the area.

Paper Structure

This paper contains 54 sections, 14 equations, 17 figures, and 14 tables.

Figures (17)

  • Figure 1: Categorisation of data-driven camera and Lidar simulation models for ADS.
  • Figure 2: An overview of the data-driven models, comprising generative models and volume renderers, that are widely used for camera and Lidar simulation in ADS. The generative approaches are (a) GANs [NIPS2014_5ca3e9b1], (b) denoising diffusion models [10.5555/3495724.3496298], and (c) auto-regressive models; the volume renderers are (d) NeRFs [10.1145/3503250] and (e) 3D Gaussian splatting models [10.1145/3592433]. The car image in (c) is sourced from ImageNet [deng2009imagenet].
  • Figure 3: Two-stage synthesis pipeline of the Semantic Bottleneck-GAN [49041], an unconditional GAN model for RGB image synthesis.
  • Figure 4: The multi-scale synthesis process of pix2pixHD [Wang2017HighResolutionIS], a paired I2I model based on GANs.
  • Figure 5: The training data flow of PCT [xiao2021synlidar], an unpaired data translation model based on GANs for sim-to-real mapping of Lidar point clouds.
  • ...and 12 more figures