RS2AD: End-to-End Autonomous Driving Data Generation from Roadside Sensor Observations

Ruidan Xing, Runyi Huang, Qing Xu, Lei He

TL;DR

RS2AD addresses the data bottleneck in end-to-end autonomous driving by reconstructing vehicle-mounted LiDAR data from roadside observations. It maps roadside LiDAR into the target vehicle's coordinate frame using a rigid-body transform with $R$ and $T$, then generates high-fidelity vehicle-mounted data via a virtual LiDAR ray-tracing model and plane fitting in view frustums. The approach enables all-weather, cross-model data synthesis and mitigates sim-to-real gaps, allowing the RS2V-L dataset to augment KITTI for training. Experiments show that mixing RS2V-L with KITTI improves 3D object detection AP and markedly enhances performance in complex scenarios, while achieving over tenfold gains in data-generation efficiency.
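The rigid-body mapping mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the convention $p_v = R\,p_{rs} + T$ (roadside coordinates to vehicle coordinates), and the function names, the yaw-only pose, and the numeric values are all hypothetical.

```python
import numpy as np

def yaw_rotation(yaw_rad: float) -> np.ndarray:
    """Rotation about the z-axis by yaw_rad (right-handed, counter-clockwise)."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def roadside_to_vehicle(points_rs: np.ndarray, R: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Apply p_v = R @ p_rs + T to every row of an (N, 3) array."""
    return points_rs @ R.T + T

# Hypothetical pose of the target vehicle's LiDAR, expressed in the roadside frame:
# position `center` and heading `yaw` (assumption: roll and pitch are ignored).
center = np.array([10.0, 5.0, -4.0])
R_vehicle_to_rs = yaw_rotation(np.deg2rad(90.0))

# Invert the pose to obtain the roadside-to-vehicle transform.
R = R_vehicle_to_rs.T
T = -R @ center

points_rs = np.array([
    [10.0, 5.0, -4.0],   # the vehicle LiDAR origin itself
    [10.0, 7.0, -4.0],   # a point 2 m ahead along the vehicle's heading
])
points_v = roadside_to_vehicle(points_rs, R, T)
```

Under these assumptions the first point maps to the origin of the vehicle frame and the second to roughly (2, 0, 0), that is, 2 m straight ahead of the virtual sensor.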

Abstract

End-to-end autonomous driving solutions, which process multi-modal sensory data to directly generate refined control commands, have become a dominant paradigm in autonomous driving research. However, these approaches predominantly depend on single-vehicle data collection for model training and optimization, resulting in significant challenges such as high data acquisition and annotation costs, the scarcity of critical driving scenarios, and fragmented datasets that impede model generalization. To mitigate these limitations, we introduce RS2AD, a novel framework for reconstructing and synthesizing vehicle-mounted LiDAR data from roadside sensor observations. Specifically, our method transforms roadside LiDAR point clouds into the vehicle-mounted LiDAR coordinate system by leveraging the target vehicle's relative pose. Subsequently, high-fidelity vehicle-mounted LiDAR data is synthesized through virtual LiDAR modeling, point cloud classification, and resampling techniques. To the best of our knowledge, this is the first approach to reconstruct vehicle-mounted LiDAR data from roadside sensor inputs. Extensive experimental evaluations demonstrate that incorporating the data generated by the RS2AD method (the RS2V-L dataset) into model training as a supplement to the KITTI dataset can significantly enhance the accuracy of 3D object detection and greatly improve the efficiency of end-to-end autonomous driving data generation. These findings strongly validate the effectiveness of the proposed method and underscore its potential in reducing dependence on costly vehicle-mounted data collection while improving the robustness of autonomous driving models.
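To make the virtual LiDAR resampling step more concrete, here is a rough sketch of one way to resample a transformed point cloud onto a virtual beam pattern. It is a deliberate simplification and not the paper's method: each angular cell (a stand-in for a view frustum) simply keeps its nearest point, whereas the TL;DR describes plane fitting inside each frustum. The beam count, field of view, azimuth resolution, and maximum range are hypothetical values.

```python
import numpy as np

def resample_virtual_lidar(points: np.ndarray,
                           n_beams: int = 64,
                           n_azimuth: int = 1024,
                           fov_down_deg: float = -25.0,
                           fov_up_deg: float = 3.0,
                           max_range: float = 80.0) -> np.ndarray:
    """Keep at most one point (the nearest) per virtual beam/azimuth cell."""
    rng = np.linalg.norm(points, axis=1)
    keep = (rng > 1e-6) & (rng <= max_range)
    points, rng = points[keep], rng[keep]

    # Spherical angles of every point as seen from the virtual sensor origin.
    azimuth = np.arctan2(points[:, 1], points[:, 0])
    elevation = np.arcsin(points[:, 2] / rng)

    # Discard points outside the assumed vertical field of view.
    fov_down, fov_up = np.deg2rad(fov_down_deg), np.deg2rad(fov_up_deg)
    in_fov = (elevation >= fov_down) & (elevation <= fov_up)
    points, rng = points[in_fov], rng[in_fov]
    azimuth, elevation = azimuth[in_fov], elevation[in_fov]

    # Map angles to integer cell indices (row = beam, col = azimuth step).
    col = np.floor((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int) % n_azimuth
    row = np.floor((elevation - fov_down) / (fov_up - fov_down) * n_beams).astype(int)
    row = np.clip(row, 0, n_beams - 1)
    cell = row * n_azimuth + col

    # Sort by cell, then by range, and keep the first (nearest) point per cell.
    order = np.lexsort((rng, cell))
    cell_sorted = cell[order]
    first = np.ones(len(order), dtype=bool)
    first[1:] = cell_sorted[1:] != cell_sorted[:-1]
    return points[order[first]]

demo = np.array([
    [5.0, 0.0, -0.5],    # nearest return along one direction
    [10.0, 0.0, -1.0],   # same direction, farther away: occluded
    [0.0, 5.0, -0.5],    # a different azimuth
    [1.0, 0.0, 5.0],     # far above the assumed vertical field of view
])
out = resample_virtual_lidar(demo)
```

In this toy input, the farther of the two collinear points and the point outside the vertical field of view are dropped, which mimics occlusion and the limited beam pattern of a vehicle-mounted sensor.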

Paper Structure

This paper contains 11 sections, 4 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The proposed method generates vehicle-mounted LiDAR data for the four corresponding vehicles in the scene based on roadside LiDAR observations. The central section of the figure illustrates the data collected by the roadside LiDAR, while the left and right sections depict the vehicle-mounted LiDAR data for the vehicles highlighted within the yellow annotation boxes.
  • Figure 2: The overall technical architecture of RS2AD.
  • Figure 3: Schematic Representation of LiDAR Ray $l_{ij}$ and Corresponding View Frustum $w_{ij}$.
  • Figure 4: Two scene examples are depicted, with the ground truth illustrated on the far left. For each example, the first row presents the test results of three models (PointPillars, SECOND, PV-RCNN) trained solely on the KITTI dataset, while the second row shows the results of the same models trained on a combination of the KITTI and RS2V-L datasets. Comparing the two rows reveals a notable improvement in detection accuracy for the models trained with the mixed-dataset strategy.