R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding

Qirui Wu, Sonia Raychaudhuri, Daniel Ritchie, Manolis Savva, Angel X Chang

TL;DR

The Reality-linked 3D Scenes (R3DS) dataset provides synthetic 3D scenes that mirror the real-world scene arrangements in Matterport3D panoramas, and it offers a challenging benchmark for future work on panoramic scene understanding.

Abstract

We introduce the Reality-linked 3D Scenes (R3DS) dataset of synthetic 3D scenes mirroring the real-world scene arrangements from Matterport3D panoramas. Compared to prior work, R3DS has more complete and densely populated scenes with objects linked to real-world observations in panoramas. R3DS also provides an object support hierarchy, and matching object sets (e.g., same chairs around a dining table) for each scene. Overall, R3DS contains 19K objects represented by 3,784 distinct CAD models from over 100 object categories. We demonstrate the effectiveness of R3DS on the Panoramic Scene Understanding task. We find that: 1) training on R3DS enables better generalization; 2) support relation prediction trained with R3DS improves performance compared to heuristically calculated support; and 3) R3DS offers a challenging benchmark for future work on panoramic scene understanding.
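
To make the two annotation types mentioned above more concrete, here is a minimal Python sketch of how an object support hierarchy and matching object sets could be represented and queried. It is not the R3DS file format: the `SceneObject` class, its field names, the identifiers in the toy scene, and the choice to derive matching sets by grouping on a shared CAD model id are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class SceneObject:
    # All field names are hypothetical; they are not taken from the R3DS release.
    object_id: str
    category: str
    model_id: str              # identifier of the CAD model used for this object
    parent_id: Optional[str]   # supporting object or architectural element, None for a root


def support_children(objects):
    """Map each supporting element to the ids of the objects it directly supports."""
    children = defaultdict(list)
    for obj in objects:
        if obj.parent_id is not None:
            children[obj.parent_id].append(obj.object_id)
    return dict(children)


def matching_sets(objects):
    """Group objects sharing a CAD model (an assumed stand-in for annotated matching sets)."""
    by_model = defaultdict(list)
    for obj in objects:
        by_model[obj.model_id].append(obj.object_id)
    # Keep only groups with at least two members.
    return [ids for ids in by_model.values() if len(ids) > 1]


# Toy scene: a dining table and two identical chairs on the floor, a vase on the table.
scene = [
    SceneObject("floor", "floor", "arch_floor", None),
    SceneObject("table_0", "dining_table", "cad_table_a", "floor"),
    SceneObject("chair_0", "chair", "cad_chair_a", "floor"),
    SceneObject("chair_1", "chair", "cad_chair_a", "floor"),
    SceneObject("vase_0", "vase", "cad_vase_a", "table_0"),
]

print(support_children(scene))
print(matching_sets(scene))
```

Storing one parent pointer per object keeps the hierarchy a tree rooted at architectural elements, which is one simple way to rule out unsupported "floating" objects.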

Paper Structure (11 sections, 26 figures, 7 tables)

Figures (26)

  • Figure 1: Left: the Reality-linked 3D Scenes dataset (R3DS) fills a gap between synthetic 3D scenes and reconstructions of real-world environments by providing 3D scene proxies linked to real-world panoramas from Matterport3D (three example panoramas and 3D scenes shown). Right: our dataset contains scenes with higher density and completeness compared to prior datasets, and provides additional annotations such as object support (what objects or architectural elements support other objects), and matching object sets (e.g., pairs of the same nightstand). We use our dataset for the panoramic scene understanding task and demonstrate its value for research on room layout estimation, as well as 2D and 3D object detection.
  • Figure 2: Dataset comparison. Top: different views of a scene annotated in R3DS. Bottom: comparison with previous datasets shows that (1) R3DS has more complete scenes; (2) objects in R3DS are properly supported by either architecture or other objects, unlike in the other datasets (e.g., floating objects with no proper support); and (3) R3DS uses the same 3D model for objects arranged together (chairs by the dining table, couches arranged together).
  • Figure 3: R3DS annotation pipeline. Annotators see an empty scene (architecture only). They then insert and manipulate 3D object models from a panorama viewpoint to create a populated 3D scene proxy corresponding to the panorama.
  • Figure 4: Architecture comparison. Compared to Scan2CAD (no architecture) and CAD-Estate (partial architecture), R3DS provides complete architecture with door/window portals.
  • Figure 5: Qualitative results for the cross-dataset Panoramic Scene Understanding task. Correct and incorrect object detections are shown in green and red boxes, respectively. Ground-truth room layout and meshes are shown in gray, while the layout prediction is in yellow. Training on R3DS leads to fewer errors compared to training on other datasets, especially when mixed with real data.
  • ...and 21 more figures