LOGen: Toward Lidar Object Generation by Point Diffusion

Ellington Kirby, Mickael Chen, Renaud Marlet, Nermin Samet

TL;DR

LOGen introduces a diffusion-based transformer tailored for LiDAR object generation, enabling controllable synthesis of object-centered 3D point clouds with intensity. It uses a novel LiDAR object parameterization, PointNet-based point embeddings, and conditioning on viewpoint and distance to produce realistic objects, and it is evaluated on nuScenes and KITTI-360 with new LiDAR-specific metrics. The results show that LOGen achieves superior object-level fidelity and improves scene-level semantics and downstream segmentation, including cross-dataset augmentation benefits. This work establishes a first standard for LiDAR object generation and suggests future directions for speed, memory efficiency, and broader applications in simulation and data augmentation.

Abstract

The generation of LiDAR scans is a growing topic with diverse applications to autonomous driving. However, scan generation remains challenging, especially when compared to the rapid advancement of image and 3D object generation. We consider the task of LiDAR object generation, requiring models to produce 3D objects as viewed by a LiDAR scan. It focuses LiDAR scan generation on a key aspect of scenes, the objects, while also benefiting from advancements in 3D object generative methods. We introduce a novel diffusion-based model to produce LiDAR point clouds of dataset objects, including intensity, and with an extensive control of the generation via conditioning information. Our experiments on nuScenes and KITTI-360 show the quality of our generations measured with new 3D metrics developed to suit LiDAR objects. The code is available at https://github.com/valeoai/LOGen.

Paper Structure

This paper contains 56 sections, 13 equations, 9 figures, 19 tables.

Figures (9)

  • Figure 1: Novel LiDAR objects generated by LOGen trained on the nuScenes train set, produced using the conditioning information of real objects from the nuScenes validation set. Relative to the sensor, $rot$ is the object's rotation while $d$ is the distance. Point color is according to LiDAR intensity. More examples are in the appendix.
  • Figure 2: Architecture of LOGen and baselines PixArt-L and DiT-3DL.
  • Figure 3: Generations with intensities. For each pair, the left object is from the nuScenes val set, the right object is generated using the same conditioning and number of points.
  • Figure 4: Quantitative & qualitative comparison of LiDAR objects on KITTI-360. (Top) Evaluation on 1000 sampled objects (for LOGen) or detected objects (for others) using Chamfer Distance (CD) and Jensen-Shannon Divergence (JSD). (Bottom) Examples of generated objects from KITTI-360 for person and car. Classwise results of LOGen and additional metrics EMD, FPD, and KPD are in the appendix.
  • Figure 5: Comparisons of real objects and generated output for all ten classes. Note that LOGen both captures the LiDAR pattern and generates objects of varying scales and shapes. Other models produce outputs with a degraded LiDAR pattern or do not generate coherent examples of rare classes such as bikes, motorcycles and trucks.
  • ...and 4 more figures
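
The caption of Figure 4 reports Chamfer Distance (CD) between point clouds. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch of one common convention: the mean squared nearest-neighbour distance, summed over both directions. This is an assumption for illustration only; the exact CD variant used in the paper (squared or not, summed or averaged) is not stated in this section, and the function name `chamfer_distance` is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance between two non-empty point sets.

    Convention assumed here (not confirmed by the source): for each point,
    take the squared distance to its nearest neighbour in the other set,
    average within each direction, then sum the two directions.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    # Direction a -> b: nearest neighbour in b for every point of a.
    a_to_b = sum(min(_sq_dist(p, q) for q in b) for p in a) / len(a)
    # Direction b -> a: nearest neighbour in a for every point of b.
    b_to_a = sum(min(_sq_dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


if __name__ == "__main__":
    real = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    generated = [(0.0, 0.0, 0.0)]
    print(chamfer_distance(real, generated))
```

The brute-force search is quadratic in the number of points, which is acceptable for small object-level clouds; larger clouds would typically use a spatial index such as a k-d tree.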