A Unified Generative Framework for Realistic Lidar Simulation in Autonomous Driving Systems

Hamed Haghighi, Mehrdad Dianati, Valentina Donzella, Kurt Debattista

TL;DR

This work proposes a unified generative framework to enhance Lidar simulation fidelity. The framework projects Lidar point clouds into depth-reflectance images and translates them with a novel Controllable Lidar point cloud Generative model, CoLiGen.

Abstract

Simulation models for perception sensors are integral components of automotive simulators used for the virtual Verification and Validation (V&V) of Autonomous Driving Systems (ADS). These models also serve as powerful tools for generating synthetic datasets to train deep learning-based perception models. Lidar is a widely used sensor type among the perception sensors for ADS due to its high precision in 3D environment scanning. However, developing realistic Lidar simulation models is a significant technical challenge. In particular, unrealistic models can result in a large gap between the synthesised and real-world point clouds, limiting their effectiveness in ADS applications. Recently, deep generative models have emerged as promising solutions to synthesise realistic sensory data. However, for Lidar simulation, deep generative models have been primarily hybridised with conventional algorithms, leaving unified generative approaches largely unexplored in the literature. Motivated by this research gap, we propose a unified generative framework to enhance Lidar simulation fidelity. Our proposed framework projects Lidar point clouds into depth-reflectance images via a lossless transformation, and employs our novel Controllable Lidar point cloud Generative model, CoLiGen, to translate the images. We extensively evaluate our CoLiGen model, comparing it with the state-of-the-art image-to-image translation models using various metrics to assess the realness, faithfulness, and performance of a downstream perception model. Our results show that CoLiGen exhibits superior performance across most metrics. The dataset and source code for this research are available at https://github.com/hamedhaghighi/CoLiGen.git.
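
To make the projection step in the abstract more concrete, the following is a minimal sketch of mapping a point cloud to depth and reflectance grids and back. It assumes a plain spherical projection with an evenly spaced vertical field of view; the function names, the `fov_up_deg`/`fov_down_deg` parameters, and the nearest-return rule are illustrative assumptions, not the authors' implementation. Unlike the lossless transformation the paper describes, this simple variant can lose points when several fall into one pixel, and reconstruction snaps each point to its pixel centre.

```python
import math


def project_to_images(points, height, width, fov_up_deg, fov_down_deg):
    """Project (x, y, z, reflectance) points into depth and reflectance grids.

    Assumption: a depth of 0.0 marks a pixel with no return (raydrop).
    """
    fov_up = math.radians(fov_up_deg)
    fov = fov_up - math.radians(fov_down_deg)  # total vertical field of view
    depth = [[0.0] * width for _ in range(height)]
    refl = [[0.0] * width for _ in range(height)]
    for x, y, z, r in points:
        d = math.sqrt(x * x + y * y + z * z)
        if d == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)    # azimuth in [-pi, pi]
        pitch = math.asin(z / d)  # elevation angle
        col = int(0.5 * (1.0 - yaw / math.pi) * width) % width
        row = min(max(int((fov_up - pitch) / fov * height), 0), height - 1)
        # Keep the nearest return when several points share a pixel.
        if depth[row][col] == 0.0 or d < depth[row][col]:
            depth[row][col] = d
            refl[row][col] = r
    return depth, refl


def reconstruct_points(depth, refl, fov_up_deg, fov_down_deg):
    """Rebuild (x, y, z, reflectance) points from pixel-centre ray directions."""
    height, width = len(depth), len(depth[0])
    fov_up = math.radians(fov_up_deg)
    fov = fov_up - math.radians(fov_down_deg)
    points = []
    for row in range(height):
        for col in range(width):
            d = depth[row][col]
            if d == 0.0:
                continue  # empty pixel: no point to reconstruct
            pitch = fov_up - (row + 0.5) / height * fov
            yaw = math.pi * (1.0 - 2.0 * (col + 0.5) / width)
            points.append((
                d * math.cos(pitch) * math.cos(yaw),
                d * math.cos(pitch) * math.sin(yaw),
                d * math.sin(pitch),
                refl[row][col],
            ))
    return points
```

In this sketch the depth value of each occupied pixel is preserved exactly, while the angular position is quantised to the image grid.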

Paper Structure (20 sections, 10 equations, 8 figures, 3 tables)


Figures (8)

  • Figure 1: We introduce a unified generative framework to enhance the realism of Lidar simulation. Our framework directly translates all Lidar data attributes into more realistic representations, enhancing both the shape characteristics, derived from the points' depth and raydrop pattern, as well as reflectance properties. This approach substantially improves the overall fidelity of Lidar simulation.
  • Figure 2: Overview of the proposed generative framework. Our framework begins by projecting the simulated point cloud into depth-reflectance images and a semantic label layout. Subsequently, these representations are translated using our novel CoLiGen model to enhance their realism. Finally, the more realistic point cloud is reconstructed from the synthesised depth-reflectance images, thus completing the process.
  • Figure 3: Overview of the inference and training losses of our CoLiGen model. The model consists of the $\mathbf{G_{enc}}$, $\mathbf{E}$, and $\mathbf{G_{dec}}$ sub-networks. $\mathbf{G_{enc}}$ encodes the depth-reflectance image $\mathbf{d}$ and $\mathbf{E}$ encodes the auxiliary image $\mathbf{c}$. These two codes are summed to form $\mathbf{z}$, which is then decoded by $\mathbf{G_{dec}}$ to output $\mathbf{\tilde{y}}$. The Raydrop Synthesis (RS) module renders raydrop and synthesises the final image $\mathbf{\hat{y}}$. The network is trained with a GAN loss [10.5555/2969033.2969125] using the discriminator $\mathbf{D}$, and a PatchNCE loss [park2020cut] using the features extracted by $\mathbf{G_{enc}}$.
  • Figure 4: Overview of the Raydrop Synthesis (RS) module. The RS module takes as input $\mathbf{\tilde{y}}$, which consists of the raydrop logits $\mathbf{\tilde{y}_{p}}$ and the complete depth-reflectance image $\mathbf{\tilde{y}_{l}}$. The Gumbel-Sigmoid function [journals/corr/JangGP16] is applied to the raydrop logits, yielding the binary raydrop mask $\mathbf{\tilde{y}_{m}}$. Lastly, the raydrop mask is multiplied with the complete image, resulting in the final synthesised image $\mathbf{\hat{y}}$.
  • Figure 5: Overview of the PatchNCE loss [park2020cut]. We adopt a patch-wise contrastive objective to enforce consistency during the unpaired image translation. This objective ensures that a patch in the image synthesised by the generator $\mathbf{G}$ has the highest similarity to the patch of the simulated image in the same position, compared to patches in other positions.
  • ...and 3 more figures
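
The Raydrop Synthesis step described in the Figure 4 caption can be illustrated with a small, framework-free sketch. It assumes the common logistic-noise formulation of the Gumbel-Sigmoid and a hard 0.5 threshold to obtain the binary mask; the function names, the temperature `tau`, the convention that a mask value of 1 keeps the point, and the single-channel grid layout are illustrative assumptions rather than the authors' code. In a trainable version the thresholding would typically be paired with a straight-through gradient, which is not shown here.

```python
import math
import random


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gumbel_sigmoid(logit, tau=1.0, rng=random):
    """Relaxed Bernoulli sample: sigmoid((logit + g1 - g2) / tau).

    The difference of two Gumbel(0, 1) samples follows a Logistic(0, 1)
    distribution, so it is drawn directly from a single uniform sample.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # avoid log(0)
    noise = math.log(u) - math.log(1.0 - u)
    return _sigmoid((logit + noise) / tau)


def raydrop_synthesis(logits, complete, tau=1.0, rng=random):
    """Sample a binary raydrop mask and apply it to the complete image.

    logits:   2D grid of raydrop logits (the caption's y~_p).
    complete: 2D grid holding one channel of the complete image (y~_l).
    Returns (mask, final): the binary mask (y~_m) and the masked image (y^).
    """
    mask = [[1 if gumbel_sigmoid(p, tau, rng) > 0.5 else 0 for p in row]
            for row in logits]
    final = [[m * v for m, v in zip(m_row, v_row)]
             for m_row, v_row in zip(mask, complete)]
    return mask, final
```

For a depth-reflectance image with two channels, the same mask would be applied to each channel in turn, so a dropped ray has neither a depth nor a reflectance value.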