Gaussian Splatting is an Effective Data Generator for 3D Object Detection

Farhad G. Zanjani, Davide Abati, Auke Wiggers, Dimitris Kalatzis, Jens Petersen, Hong Cai, Amirhossein Habibian

TL;DR

This work tackles the data bottleneck in 3D object detection for autonomous driving by introducing a 3D Gaussian Splatting–based augmentation pipeline. Unlike diffusion-based methods that implicitly condition on layouts, the approach reconstructs scenes with OmniRe and inserts new rigid agents through explicit SE(3) transformations, ensuring physically plausible placement and accurate 3D pose annotations. Empirical results on nuScenes show that 3D-GS augmentation consistently outperforms diffusion baselines in both monocular and multi-view settings, with geometric diversity driving the majority of gains and pose variation enhancing orientation accuracy. The study also finds that generating hard examples offers limited benefits, suggesting that broad geometric variation and credible geometry-based augmentation are more impactful for camera-based 3D detection in autonomous driving.

Abstract

We investigate data augmentation for 3D object detection in autonomous driving. We utilize recent advancements in 3D reconstruction based on Gaussian Splatting for 3D object placement in driving scenes. Unlike existing diffusion-based methods that synthesize images conditioned on BEV layouts, our approach places 3D objects directly in the reconstructed 3D space with explicitly imposed geometric transformations. This ensures both the physical plausibility of object placement and highly accurate 3D pose and position annotations. Our experiments demonstrate that even by integrating a limited number of external 3D objects into real scenes, the augmented data significantly enhances 3D object detection performance and outperforms existing diffusion-based 3D augmentation for object detection. Extensive testing on the nuScenes dataset reveals that imposing high geometric diversity in object placement has a greater impact than the appearance diversity of objects. Additionally, we show that generating hard examples, either by maximizing detection loss or imposing high visual occlusion in camera images, does not lead to more efficient 3D data augmentation for camera-based 3D object detection in autonomous driving.
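To make the idea of an explicitly imposed geometric transformation concrete, the following is a minimal sketch of placing an object in a scene with a rigid SE(3) transform and reading the annotation off the same transform. It is not the authors' implementation: the function names, the z-up convention, the yaw-only rotation, and the box dimensions are all assumptions chosen for illustration.

```python
import numpy as np

def se3_from_yaw_translation(yaw, translation):
    """Build a 4x4 rigid transform: rotation about the z (up) axis, then translation.
    Assumption: z-up scene frame and yaw-only rotation (hypothetical helper)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

def box_corners(size):
    """Eight corners of a box (length, width, height) centered at the object origin."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return 0.5 * signs * np.asarray(size, dtype=float)

def place_object(points_local, T):
    """Map Nx3 object-frame points into the scene frame via homogeneous coordinates."""
    homo = np.hstack([points_local, np.ones((len(points_local), 1))])
    return (T @ homo.T).T[:, :3]

# Illustrative values only: a 4.0 x 2.0 x 1.5 m box, rotated 90 degrees, moved to (10, 2, 0).
T = se3_from_yaw_translation(np.pi / 2, [10.0, 2.0, 0.0])
corners_scene = place_object(box_corners((4.0, 2.0, 1.5)), T)

# The 3D annotation follows directly from the transform that was applied.
center = corners_scene.mean(axis=0)
extent = corners_scene.max(axis=0) - corners_scene.min(axis=0)
```

Because the pose is chosen before rendering, the box center and heading are known exactly rather than estimated from the generated image, which is the property the abstract contrasts with layout-conditioned diffusion.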

Paper Structure

This paper contains 23 sections, 5 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Image augmentation through 3D scene reconstruction and physically plausible object placement for 3D object detection.
  • Figure 2: Overview of multi-camera 3D data augmentation through object placement in the 3D Gaussian Splatting field.
  • Figure 3: Examples of single-camera augmentation. These examples demonstrate physically plausible insertion of one agent into the camera view.
  • Figure 4: Examples of pose-aligned agent placement (left) versus random pose placement (right).
  • Figure 5: Examples of agent placement with low (left) and high (right) occlusion score.
  • ...and 7 more figures