
LidarDM: Generative LiDAR Simulation in a Generated World

Vlas Zyrianov, Henry Che, Zhijian Liu, Shenlong Wang

TL;DR

LidarDM introduces a 4D LiDAR generative framework that jointly models a 4D driving world and the resulting LiDAR observations, enabling layout-conditioned, temporally coherent LiDAR videos. It combines a 3D scene generator (SDF-based latent diffusion with map conditioning), dynamic actor generation (GET3D/AvatarCLIP), and trajectory synthesis with physics-informed ray casting and stochastic raydrop to produce realistic sensor data. In the reported experiments, the approach outperforms competing methods in realism and temporal consistency, aligns closely with the input map, and can augment real data to improve perception models. This asset-free, controllable simulation pipeline offers a scalable tool for training, evaluating, and testing autonomous driving systems in safety-critical and rare scenarios.

Abstract

We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos. LidarDM stands out with two unprecedented capabilities in LiDAR generative modeling: (i) LiDAR generation guided by driving scenarios, offering significant potential for autonomous driving simulations, and (ii) 4D LiDAR point cloud generation, enabling the creation of realistic and temporally coherent sequences. At the heart of our model is a novel integrated 4D world generation framework. Specifically, we employ latent diffusion models to generate the 3D scene, combine it with dynamic actors to form the underlying 4D world, and subsequently produce realistic sensory observations within this virtual environment. Our experiments indicate that our approach outperforms competing algorithms in realism, temporal coherency, and layout consistency. We additionally show that LidarDM can be used as a generative world model simulator for training and testing perception models.
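
The final step described above, producing sensor observations inside the generated world, can be illustrated with a toy example. The sketch below is not LidarDM's implementation: it assumes a 2D bird's-eye-view world in which actors are discs, casts rays from a sensor at the origin, and then drops each return with a fixed probability as a stand-in for stochastic raydrop. This summary does not specify how LidarDM estimates drop probabilities, so the constant `drop_prob`, the `Circle` type, and the function names `cast_ray` and `simulate_scan` are all hypothetical.

```python
import math
import random
from typing import List, Optional, Tuple

# Hypothetical actor footprint: (center_x, center_y, radius) in meters.
Circle = Tuple[float, float, float]


def cast_ray(angle: float, obstacles: List[Circle], max_range: float) -> Optional[float]:
    """Return the nearest hit distance for a ray from the origin, or None.

    Assumes the sensor sits outside every obstacle.
    """
    dx, dy = math.cos(angle), math.sin(angle)
    best: Optional[float] = None
    for cx, cy, r in obstacles:
        # Solve |t * d - c|^2 = r^2 for t, where d is the unit ray direction.
        b = dx * cx + dy * cy  # projection of the circle center onto the ray
        disc = b * b - (cx * cx + cy * cy - r * r)
        if disc < 0:
            continue  # the ray misses this circle
        t = b - math.sqrt(disc)  # nearer of the two intersections
        if 0 < t <= max_range and (best is None or t < best):
            best = t
    return best


def simulate_scan(
    obstacles: List[Circle],
    num_rays: int = 360,
    max_range: float = 50.0,
    drop_prob: float = 0.1,  # assumed constant; a stand-in for a raydrop model
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Cast evenly spaced rays, then randomly drop returns (toy raydrop)."""
    rng = random.Random(seed)
    scan: List[Tuple[float, float]] = []
    for i in range(num_rays):
        angle = 2.0 * math.pi * i / num_rays
        hit = cast_ray(angle, obstacles, max_range)
        if hit is None:
            continue  # nothing within range along this ray
        if rng.random() < drop_prob:
            continue  # the return is dropped at random
        scan.append((angle, hit))
    return scan


if __name__ == "__main__":
    world = [(10.0, 0.0, 1.0), (0.0, 20.0, 2.0)]
    points = simulate_scan(world, drop_prob=0.2)
    print(f"{len(points)} returns kept after raydrop")
```

Separating the deterministic geometry (`cast_ray`) from the random drop step keeps the physical part reproducible while letting the noise model be swapped out, for example for per-ray probabilities instead of a constant.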

Paper Structure

This paper contains 44 sections, 2 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: We present LidarDM, a novel 4D LiDAR generative model. Our generated LiDAR videos are simultaneously realistic, layout-conditioned, physically plausible, diverse, and temporally coherent.
  • Figure 1: Comparison of Layout-Conditioned LiDAR Generation on the Waymo dataset: Our approach significantly outperforms the strong latent-diffusion-based sequential generation baseline in terms of realism, physical plausibility, and coherence with the input layout.
  • Figure 2: Applications of LidarDM: (a) generating LiDAR that aligns well with the map (colored boxes highlight the consistency between the LiDAR and the map) without 3D capturing or modeling; (b) providing sensor data for an existing traffic simulator (Waymax), enabling safety-critical scenario evaluation from pure sensor data; (c) generating large volumes of LiDAR data with controllable obstacle locations (treated as ground-truth labels, which are free to obtain) to improve perception models via pre-training without expensive data capture and labeling.
  • Figure 2: More Map-Aligned Qualitative Results. We showcase 4 different frames of the same sequence, with both map-aligned and LiDAR top-down views. We also show the accumulated point clouds, colored by their time index to showcase the temporal consistency.
  • Figure 3: Overview of LidarDM: Given the input traffic layout at time $t=0$, LidarDM begins by generating actors and the static scene. We then generate the motion of the actors and the ego car, and compose the underlying 4D world. Finally, a generative- and physics-based simulation is used to create realistic 4D sensor data.
  • ...and 8 more figures