Automating 3D Dataset Generation with Neural Radiance Fields

P. Schulz, T. Hempel, A. Al-Hamadi

TL;DR

This work tackles the challenge of obtaining diverse, precisely annotated 3D data for 6D pose estimation, a process that is typically labor-intensive and object-specific. It introduces an end-to-end pipeline that uses Neural Radiance Fields to synthesize high-fidelity 3D object models from 2D images and feeds them into a synthetic data generator (BlenderProc) to produce labeled datasets. The authors validate the pipeline by capturing six objects, reconstructing textured meshes with a NeuS-based NeRF approach, and training DOPE-based pose estimators for tabletop and in-hand tasks, demonstrating practical viability. The work shows that radiance-field-based 3D model creation can automate dataset generation and reduce reliance on pre-built 3D models. Future directions include full object reconstruction via model fusion and stereo data expansion for quantitative benchmarking.

Abstract

3D detection is a critical task for understanding the spatial characteristics of the environment and is used in a variety of applications including robotics, augmented reality, and image retrieval. Training performant detection models requires diverse, precisely annotated, and large-scale datasets that involve complex and expensive creation processes. Hence, there are only a few public 3D datasets, and these are additionally limited in their range of classes. In this work, we propose a pipeline for automatic generation of 3D datasets for arbitrary objects. By utilizing the universal 3D representation and rendering capabilities of Radiance Fields, our pipeline generates high-quality 3D models for arbitrary objects. These 3D models serve as input for a synthetic dataset generator. Our pipeline is fast, easy to use, and has a high degree of automation. Our experiments demonstrate that 3D pose estimation networks trained with our generated datasets achieve strong performance in typical application scenarios.

Paper Structure

This paper contains 21 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Automated dataset generation with our pipeline. The process starts with capturing images of an object, continues with 3D model creation, and finishes with 3D dataset generation.
  • Figure 2: Automated dataset generation pipeline. Each block is depicted with its corresponding phases.
  • Figure 3: Design of our Object Capturing block, each capturing phase is displayed with its in- and output. The process begins with image capturing. Structure from Motion and Foreground Extraction perform post-processing on the captured images.
  • Figure 4: Object capturing setup. We placed the object on a rotating plate and captured it with a static camera.
  • Figure 5: Design of our Model Generation block. It receives the output of the previous block as input and generates a textured 3D model.
  • ...and 5 more figures