3D Reconstruction of non-visible surfaces of objects from a Single Depth View -- Comparative Study

Rafał Staszak, Piotr Michałek, Jakub Chudziński, Marek Kopicki, Dominik Belter

TL;DR

This work tackles reconstructing the full geometry of objects from a single RGB-D view by contrasting two neural approaches. DeepSDF operates in 3D space, predicting the Signed Distance Function value at query points conditioned on a latent object code, while MirrorNet reconstructs occluded surfaces by generating depth images from the opposite viewpoint. Results on ShapeNet categories show that MirrorNet achieves substantially faster inference (≈22 ms) and often a smaller Hausdorff distance (≈2–3× smaller than DeepSDF's), with comparable Chamfer distances, though it may omit surfaces not observed from the input view. The findings suggest that view-dependent depth generation offers a practical, fast alternative for single-view 3D reconstruction in robotics, with potential applications in fast grasping and manipulation.
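The two error measures named above can be illustrated with a minimal NumPy sketch. It assumes the symmetric, unsquared variants of both distances on small point clouds; the paper's exact definitions (squared or not, summed or averaged) are not given here, and the function names are hypothetical.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbour distance, both ways.

    Assumption: unsquared distances, with the two directions summed.
    """
    d = pairwise_distances(a, b)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: worst-case nearest-neighbour distance."""
    d = pairwise_distances(a, b)
    return max(d.min(axis=1).max(), d.min(axis=0).max())


# Toy example: the second cloud has one extra, far-away point.
reconstruction = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
ground_truth = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])

chamfer = chamfer_distance(reconstruction, ground_truth)
hausdorff = hausdorff_distance(reconstruction, ground_truth)
```

The Chamfer distance averages over all points, so a single missing region moves it only slightly, whereas the Hausdorff distance reports the single worst mismatch. This is one reason the two measures can rank methods differently.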

Abstract

Scene and object reconstruction is an important problem in robotics, in particular for planning collision-free trajectories and for object manipulation. This paper compares two strategies for reconstructing the non-visible parts of an object's surface from a single RGB-D camera view. The first method, DeepSDF, predicts the Signed Distance Function, that is, the signed distance to the object surface, for a given point in 3D space. The second method, MirrorNet, reconstructs the occluded parts of the object by generating images from the other side of the observed object. Experiments performed with objects from the ShapeNet dataset show that the view-dependent MirrorNet is faster and has smaller reconstruction errors in most categories.
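To make the notion of a signed distance concrete, the sketch below evaluates an analytic signed distance for a sphere. It is only an illustration of the quantity that a learned model approximates, not the DeepSDF network itself. The sign convention (negative inside, zero on the surface, positive outside) and the function name are assumptions of this example.

```python
import numpy as np


def sphere_sdf(points, center, radius):
    """Signed distance from each point (N, 3) to a sphere's surface.

    Assumed convention: negative inside, zero on the surface, positive outside.
    """
    return np.linalg.norm(points - center, axis=-1) - radius


# Three query points along the x-axis: inside, on, and outside a unit sphere.
queries = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
values = sphere_sdf(queries, center=np.zeros(3), radius=1.0)

# The surface is the zero level set; points with |sdf| below a tolerance
# can be treated as lying on it.
on_surface = queries[np.abs(values) < 1e-6]
```

A learned model replaces the analytic formula with a network evaluated at each query point, and the reconstructed surface is again recovered as the zero level set.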

Paper Structure

This paper contains 4 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Example application scenario of the object reconstruction system: the robot observes a 3D object from a single view (a). The incomplete model of the object (point cloud) (b) is fed to the neural network to obtain a full model of the object (c).
  • Figure 2: Example reconstruction results (point clouds) obtained for the view-dependent MirrorNet (top row) and DeepSDF (bottom row) compared to the ground truth models: bottle, mug, can, and laptop.