Data-driven Camera and Lidar Simulation Models for Autonomous Driving: A Review from Generative Models to Volume Renderers
Hamed Haghighi, Xiaomeng Wang, Hao Jing, Mehrdad Dianati
TL;DR
The paper surveys data-driven camera and Lidar sensor simulation methods for autonomous driving, contrasting generative models (GANs, diffusion, auto-regressive) with volume-rendering approaches (NeRFs, 3D Gaussian Splatting). It covers GAN-based, diffusion-based, and miscellaneous generative methods, detailing their input-output configurations, and compares the approaches in terms of 3D scene representation, realism, speed, stability, and diversity. It also outlines qualitative and quantitative evaluation strategies and identifies open gaps in standard benchmarks, real-time applicability, and trustworthy generalisation, suggesting directions such as hybrid physics-data models and enhanced volume rendering for dynamic scenes. The practical impact lies in guiding virtual testing workflows and synthetic data generation for robust ADS perception systems. Overall, the review clarifies the trade-offs between realism, speed, and controllability across data-driven sensor simulators and highlights where future work should focus to enable reliable, scalable ADS testing.
Abstract
Perception sensors, particularly camera and Lidar, are key elements of Autonomous Driving Systems (ADS) that enable them to comprehend their surroundings and make informed driving and control decisions. Therefore, developing realistic simulation models for these sensors is essential for conducting effective simulation-based testing of ADS. Moreover, the rise of deep learning-based perception models has increased the utility of sensor simulation models for synthesising diverse training datasets. Traditional sensor simulation models rely on computationally expensive physics-based algorithms, especially in complex systems such as ADS. Hence, the current potential resides in data-driven approaches, fuelled by the exceptional performance of deep generative models in capturing high-dimensional data distributions and of volume renderers in accurately representing scenes. This paper reviews the current state-of-the-art data-driven camera and Lidar simulation models and their evaluation methods. It explores a spectrum of models from the novel perspective of generative models and volume renderers. Generative models are discussed in terms of their input-output types, while volume renderers are categorised based on their input encoding. Finally, the paper illustrates commonly used evaluation techniques for assessing sensor simulation models and highlights the existing research gaps in the area.
