NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields

Junge Zhang, Feihu Zhang, Shaochen Kuang, Li Zhang

TL;DR

NeRF-LiDAR addresses the high cost of labeling LiDAR data for autonomous driving by learning a NeRF-based implicit representation of real-world driving scenes from multi-view images and sparse LiDAR, and then rendering realistic LiDAR point clouds with per-point semantic labels. It integrates a NeRF reconstruction module with a generation pipeline that includes raydrop, equirectangular projection, and feature- and point-level alignments, guided by weak 2D labels and limited 3D annotations. Experiments on nuScenes show that models trained on NeRF-LiDAR data achieve competitive or superior performance to those trained on real data, with substantial gains from pre-training on simulated data and effective fine-tuning with limited real data. The approach enables realistic multi-sensor simulation, supports new sensor configurations, and yields efficient rendering, offering a practical path to data-efficient 3D perception for autonomous systems.

Abstract

Labeling LiDAR point clouds for training autonomous driving models is extremely expensive and difficult. LiDAR simulation aims to generate realistic, labeled LiDAR data for training and verifying self-driving algorithms more efficiently. Recently, Neural Radiance Fields (NeRF) have been proposed for novel view synthesis using implicit reconstruction of 3D scenes. Inspired by this, we present NeRF-LiDAR, a novel LiDAR simulation method that leverages real-world information to generate realistic LiDAR point clouds. Unlike existing LiDAR simulators, we use real images and point cloud data collected by self-driving cars to learn the 3D scene representation, point cloud generation, and label rendering. We verify the effectiveness of NeRF-LiDAR by training different 3D segmentation models on the generated LiDAR point clouds. The trained models achieve accuracy similar to that of the same models trained on real LiDAR data. In addition, the generated data can boost accuracy through pre-training, which reduces the amount of real labeled data required.

Paper Structure (45 sections, 14 equations, 10 figures, 13 tables)

Figures (10)

  • Figure 1: Comparison of results from our NeRF-LiDAR and existing LiDAR simulation methods. (a) A method [carla] that creates a virtual world for LiDAR simulation. (b) A diffusion model used for LiDAR generation [lidargen]. (c) Our NeRF-LiDAR generates realistic point clouds that are nearly the same as the real LiDAR point clouds (d).
  • Figure 2: Schematic illustration of NeRF-LiDAR. Image sequences, along with the predicted weak semantic labels, are used as inputs to reconstruct the implicit NeRF model. LiDAR signals are also used to help create more accurate 3D geometry. Initial coarse point clouds are generated by the NeRF reconstruction through Eqs. \ref{eq:lidar_direction}$\sim$\ref{eq:point_generation}. The initial point clouds are projected into 2D equirectangular images. We then use a U-Net to learn raydrop and the alignment (detailed in Fig. \ref{fig:alignment}) to make the generated point clouds more realistic.
  • Figure 3: Illustration of learning raydrop and alignment. The initial coarse point clouds are projected into 2D equirectangular images. We use the projected depth, RGB texture, and depth variances as input to a standard U-Net. The U-Net learns the raydrop mask to improve the initial coarse point clouds through the point-wise alignment (Eq. \ref{eq:point_alignment}) and the feature-level alignment (Eq. \ref{eq:feature_alignment}). Finally, the refined equirectangular images are back-projected to 3D space to obtain the expected LiDAR point clouds.
  • Figure 4: Comparison of different settings for LiDAR rendering. (a) Point clouds without raydrop. (b) Point clouds after random raydrop. (c) Point clouds after our learning-based raydrop but without the feature-level alignment. (d) The final generated point clouds with both learning-based raydrop and the feature-level alignment. (e) The real LiDAR point clouds.
  • Figure 5: Comparison between the data and labels generated by NeRF-LiDAR and the real LiDAR data with human annotations. For better visualization, we project the 3D point cloud into a 2D equirectangular image with colorized labels. Our NeRF-LiDAR (a) generates accurate labels and realistic point clouds that are almost the same as the real data (b).
  • ...and 5 more figures
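
The captions for Figures 2 to 4 mention three operations on point clouds: projection into a 2D equirectangular image, back-projection to 3D, and a random raydrop baseline. The following is a minimal sketch of those operations in plain Python, not the authors' implementation. The function names, the 32 x 1024 image size, the +10 to -30 degree vertical field of view, and the drop probability are all illustrative assumptions that do not come from the paper. The learned raydrop mask and the alignment losses are not modeled here.

```python
import math
import random


def project_to_equirectangular(points, height=32, width=1024,
                               fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project (x, y, z) points into a height x width range image.

    Each pixel holds the range of the closest point that falls into it,
    or None where no point landed. Image size and field of view are
    assumed values chosen for illustration.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue
        azimuth = math.atan2(y, x)      # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)    # vertical angle in [-pi/2, pi/2]
        if not (fov_down <= elevation <= fov_up):
            continue                    # outside the vertical field of view
        col = min(int((azimuth + math.pi) / (2 * math.pi) * width), width - 1)
        row = min(int((fov_up - elevation) / (fov_up - fov_down) * height),
                  height - 1)
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image


def back_project(image, fov_up_deg=10.0, fov_down_deg=-30.0):
    """Turn a range image back into 3D points, using pixel-center angles."""
    height, width = len(image), len(image[0])
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    points = []
    for row in range(height):
        elevation = fov_up - (row + 0.5) / height * (fov_up - fov_down)
        for col in range(width):
            r = image[row][col]
            if r is None:
                continue
            azimuth = (col + 0.5) / width * 2 * math.pi - math.pi
            points.append((r * math.cos(elevation) * math.cos(azimuth),
                           r * math.cos(elevation) * math.sin(azimuth),
                           r * math.sin(elevation)))
    return points


def random_raydrop(image, drop_prob=0.1, seed=0):
    """Baseline raydrop: discard each valid pixel independently at random."""
    rng = random.Random(seed)
    return [[None if (r is not None and rng.random() < drop_prob) else r
             for r in row] for row in image]


if __name__ == "__main__":
    cloud = [(10.0, 5.0, -1.0), (3.0, -4.0, 0.5)]
    range_image = project_to_equirectangular(cloud)
    dropped = random_raydrop(range_image, drop_prob=0.5)
    print(len(back_project(range_image)), len(back_project(dropped)))
```

Because back-projection uses pixel-center angles, a round trip preserves each point's range exactly but quantizes its direction to the image resolution. In the pipeline described by the captions, a U-Net predicts the raydrop mask from the projected inputs rather than sampling it at random as this baseline does.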