ABO: Dataset and Benchmarks for Real-World 3D Object Understanding

Jasmine Collins, Shubham Goel, Kenan Deng, Achleshwar Luthra, Leon Xu, Erhan Gundogdu, Xi Zhang, Tomas F. Yago Vicente, Thomas Dideriksen, Himanshu Arora, Matthieu Guillaumin, Jitendra Malik

TL;DR

The paper introduces ABO, a large-scale dataset that links real-world product imagery with high-quality, artist-created 3D meshes and physically-based materials. It benchmarks three tasks: single-view 3D reconstruction, material estimation, and cross-domain multi-view retrieval (MVR). The results reveal significant domain gaps when models trained on synthetic data such as ShapeNet are applied to real-world objects, and show the value of multi-view data for SV-BRDF estimation. Key contributions include the ABO data release with rich metadata, automated 6-DOF pose annotations, a photorealistic material dataset, and a challenging MVR benchmark with geometry-aware evaluation. ABO is intended to support more realistic 3D object understanding research, with applications in vision, rendering, and robotics.

Abstract

We introduce Amazon Berkeley Objects (ABO), a new large-scale dataset designed to help bridge the gap between real and virtual 3D worlds. ABO contains product catalog images, metadata, and artist-created 3D models with complex geometries and physically-based materials that correspond to real, household objects. We derive challenging benchmarks that exploit the unique properties of ABO and measure the current limits of the state-of-the-art on three open problems for real-world 3D object understanding: single-view 3D reconstruction, material estimation, and cross-domain multi-view object retrieval.

Paper Structure

This paper contains 15 sections, 1 equation, 12 figures, 9 tables.

Figures (12)

  • Figure 1: ABO is a dataset of product images and realistic, high-resolution, physically-based 3D models of household objects. We use ABO to benchmark the performance of state-of-the-art methods on a variety of realistic object understanding tasks.
  • Figure 2: Posed 3D models in catalog images. We use instance masks to automatically generate 6-DOF pose annotations.
  • Figure 3: Sample catalog images and attributes that accompany ABO objects. Each object has up to 18 attribute annotations.
  • Figure 5: Qualitative 3D reconstruction results for R2N2, Occupancy Networks, GenRe, and Mesh-RCNN on ABO. All methods are pre-trained on ShapeNet and show a decrease in performance on objects from ABO.
  • Figure 6: Qualitative material estimation results for single-view (SV-net) and multi-view (MV-net) networks. We show estimated SV-BRDF properties (base color, roughness, metallicness, surface normals) for each input view of an object compared to the ground truth.
  • ...and 7 more figures