
D-Aug: Enhancing Data Augmentation for Dynamic LiDAR Scenes

Jiaxing Zhao, Peng Zheng, Rui Ma

TL;DR

The paper addresses the labeling burden of LiDAR data for autonomous driving, focusing on dynamic scenes. It introduces D-Aug, a LiDAR augmentation method that extracts objects from one scene and inserts them into another while keeping them temporally continuous across consecutive frames. Insertion positions come from pixel-level road identification, and a reference-guided insertion strategy applies dynamic collision detection and rotation alignment. The key contributions are the pixel-level road identification technique, precise object extraction via global-coordinate transformation, and the reference-guided insertion mechanism, which preserves scene realism across frames. Experiments on the nuScenes dataset show improvements in mAP and NDS for 3D detection and in AMOTA for 3D tracking. The approach reduces labeling costs and improves model performance on dynamic scenes; handling occlusion during insertion remains a challenge for future work.

Abstract

Creating large LiDAR datasets with pixel-level labeling poses significant challenges. While numerous data augmentation methods have been developed to reduce the reliance on manual labeling, these methods predominantly focus on static scenes and overlook data augmentation for dynamic scenes, which is critical for autonomous driving. To address this issue, we propose D-Aug, a LiDAR data augmentation method tailored to dynamic scenes. D-Aug extracts objects and inserts them into dynamic scenes, taking into account the continuity of these objects across consecutive frames. For seamless insertion into dynamic scenes, we propose a reference-guided method that involves dynamic collision detection and rotation alignment. Additionally, we present a pixel-level road identification strategy to efficiently determine suitable insertion positions. We validated our method on the nuScenes dataset with various 3D detection and tracking methods. Comparative experiments demonstrate the superiority of D-Aug.
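
To make the idea of dynamic collision detection concrete, here is a minimal sketch, not the authors' implementation. It assumes each object is approximated in bird's-eye view by a bounding circle, and that the inserted object's per-frame centers are already known. The function name `collision_free` and the circle approximation are hypothetical choices made for illustration only.

```python
import numpy as np

def collision_free(track_xy, radius, obstacles_per_frame):
    """Check an inserted object against existing objects in every frame.

    track_xy: (T, 2) array, BEV center of the inserted object per frame.
    radius: bounding-circle radius of the inserted object (assumed simplification).
    obstacles_per_frame: list of T arrays, each (N_t, 3) with columns x, y, radius.
    Returns True only if no frame contains an overlap.
    """
    track_xy = np.asarray(track_xy, dtype=float)
    if len(track_xy) != len(obstacles_per_frame):
        raise ValueError("need one obstacle array per frame")
    for center, obstacles in zip(track_xy, obstacles_per_frame):
        obstacles = np.asarray(obstacles, dtype=float).reshape(-1, 3)
        if len(obstacles) == 0:
            continue  # nothing to collide with in this frame
        # Distance from the inserted object's center to each obstacle center.
        dists = np.linalg.norm(obstacles[:, :2] - center, axis=1)
        # Two circles overlap when the center distance is below the radius sum.
        if np.any(dists < obstacles[:, 2] + radius):
            return False
    return True

# The inserted object moves along x; an existing object approaches it over time.
track = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
obstacles = [
    np.array([[10.0, 0.0, 1.0]]),
    np.array([[6.0, 0.0, 1.0]]),
    np.array([[3.5, 0.0, 1.0]]),
]
print(collision_free(track, 1.0, obstacles))
```

In this example a check on the first frame alone would pass, while the check over all three frames fails because the two objects overlap in the last frame. That is the reason a dynamic scene needs the test repeated per frame.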

Paper Structure

This paper contains 18 sections, 1 equation, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Illustration of augmented LiDAR data. Each row shows the augmented point clouds for three successive frames, and the two rows come from two distinct scenes. The orange and green bounding boxes represent the original and inserted objects, respectively. The red rectangles highlight the relative movement between the inserted objects and the stationary obstacles (grey boxes).
  • Figure 2: Overview of D-Aug. Consecutive point cloud frames A are processed: available insertion positions are first determined through road identification. Subsequently, objects are extracted from point cloud B by calculating direction vectors. Finally, these extracted objects are inserted into each frame of A using a reference-guided insertion approach, which incorporates dynamic collision detection and rotation alignment. Notably, the inserted objects are rotated to align with the traffic flow in the dynamic scene, resulting in augmented dynamic scenes.
  • Figure 3: Illustration of pixel-level road identification. The map is cropped, pixelized, and rotated to facilitate road identification based on pixel values. The grey areas represent the roads where the objects can be inserted.
  • Figure 4: Illustration of object extraction. Points within a bounding box are identified by calculating direction vectors. Points sharing the same direction vectors as the center point of the bounding box are extracted as objects.
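
The object extraction step in Figure 4 can be illustrated with a standard point-in-box test, sketched below. This is one common formulation and may differ from the paper's exact use of direction vectors: it projects each point's offset from the box center onto the box's axis directions and keeps points whose projections fall within half the box size. The function name `extract_object_points` and the (length, width, height, yaw) box parameterization are assumptions for illustration.

```python
import numpy as np

def extract_object_points(points, center, size, yaw):
    """Select the points that lie inside an oriented 3D bounding box.

    points: (N, 3+) array in the same coordinate frame as the box.
    center: (3,) box center; size: (3,) length, width, height.
    yaw: box heading in radians, rotation about the z axis.
    Returns the selected points and the boolean mask over all points.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    # Unit direction vectors of the box axes (length, width, height).
    forward = np.array([np.cos(yaw), np.sin(yaw), 0.0])
    left = np.array([-np.sin(yaw), np.cos(yaw), 0.0])
    up = np.array([0.0, 0.0, 1.0])
    axes = np.stack([forward, left, up])      # shape (3, 3)
    # Offsets from the box center, expressed along the box axes.
    local = (points[:, :3] - center) @ axes.T  # shape (N, 3)
    inside = np.all(np.abs(local) <= half, axis=1)
    return points[inside], inside

# A 4 x 2 x 2 box centered at (10, 5, 1), heading along +y.
pts = np.array([
    [10.0, 6.5, 1.0],   # 1.5 along the length axis
    [11.5, 5.0, 1.0],   # 1.5 along the width axis
    [10.0, 5.0, 3.0],   # 2.0 above the center
])
obj, mask = extract_object_points(pts, [10.0, 5.0, 1.0], [4.0, 2.0, 2.0], np.pi / 2)
print(mask)
```

If the points and the box are both expressed in global coordinates, as the summary's mention of a global-coordinate transformation suggests, the same test applies unchanged to every frame of a sequence.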