Occupancy Networks: Learning 3D Reconstruction in Function Space

Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, Andreas Geiger

TL;DR

This work introduces Occupancy Networks, a continuous representation of 3D geometry that models the occupancy function with a neural network conditioned on input observations. The surface is the network's decision boundary, so meshes can be extracted at arbitrary resolution with a fixed memory footprint, and an adaptive Multiresolution IsoSurface Extraction (MISE) procedure recovers high-quality surfaces efficiently. The authors demonstrate competitive or superior performance on ShapeNet across single-image reconstruction, point-cloud completion, voxel super-resolution, and unconditional mesh generation, with notable memory efficiency and the ability to represent arbitrary topology. Their ablations on sampling strategies and architecture show that uniform sampling and a ResNet-based decoder with conditional batch normalization yield the best results. Overall, occupancy networks offer a flexible, scalable framework for learning-based 3D reconstruction that preserves detail while avoiding discretization bottlenecks.

Abstract

With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
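
To make the "decision boundary as surface" idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the learned classifier $f_\theta$ is replaced by a hand-written stand-in, and the names `toy_occupancy`, `is_occupied`, `voxelize`, and `occupied_volume`, the threshold `tau = 0.5`, and the `[-1, 1]^3` bounding cube are all assumptions made for illustration. What it tries to show is that one continuous function can be queried at any point, or sampled at any grid resolution, without changing the number of parameters that describe the shape.

```python
import math


def toy_occupancy(p, radius=0.5, sharpness=20.0):
    """Hypothetical stand-in for f_theta: occupancy probability of point p.

    Returns a value in (0, 1) that equals 0.5 at distance `radius` from the
    origin, is higher inside that sphere and lower outside it.
    """
    dist = math.sqrt(sum(c * c for c in p))
    return 1.0 / (1.0 + math.exp(-sharpness * (radius - dist)))


def is_occupied(p, tau=0.5):
    """Classify a point by thresholding the occupancy probability at tau."""
    return toy_occupancy(p) >= tau


def voxelize(resolution, tau=0.5):
    """Sample the continuous function at the cell centres of a
    resolution^3 grid over the (assumed) bounding cube [-1, 1]^3."""
    step = 2.0 / resolution
    centres = [-1.0 + (i + 0.5) * step for i in range(resolution)]
    return [[[is_occupied((x, y, z), tau) for z in centres]
             for y in centres]
            for x in centres]


def occupied_volume(resolution):
    """Estimate the enclosed volume from a voxelization at this resolution."""
    grid = voxelize(resolution)
    cell_volume = (2.0 / resolution) ** 3
    n_occupied = sum(cell for plane in grid for row in plane for cell in row)
    return n_occupied * cell_volume


if __name__ == "__main__":
    # The same function is sampled at three resolutions; only the number
    # of queries changes, not the description of the shape.
    for res in (8, 16, 32):
        print(res, occupied_volume(res))
```

With a trained network, `toy_occupancy` would be replaced by a forward pass conditioned on the input observation. A dense grid like the one `voxelize` builds needs `resolution ** 3` queries, which is the cost an adaptive extraction scheme tries to reduce.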

Paper Structure

This paper contains 17 sections, 6 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overview: Existing 3D representations discretize the output space differently: (a) spatially in voxel representations, (b) in terms of predicted points, and (c) in terms of vertices for mesh representations. In contrast, (d) we propose to consider the continuous decision boundary of a classifier $f_\theta$ (e.g., a deep neural network) as a 3D surface, which allows extracting 3D meshes at any resolution.
  • Figure 2: Multiresolution IsoSurface Extraction: We first mark all points at a given resolution which have already been evaluated as either occupied (red circles) or unoccupied (cyan diamonds). We then determine all voxels that have both occupied and unoccupied corners, mark them as active (light red), and subdivide each of them into 8 subvoxels. Next, we evaluate all new grid points (empty circles) that have been introduced by the subdivision. The previous two steps are repeated until the desired output resolution is reached. Finally, we extract the mesh using the marching cubes algorithm [Lorensen and Cline, 1987], then simplify and refine the output mesh using first- and second-order gradient information.
  • Figure 3: Discrete vs. Continuous. Qualitative comparison of our continuous representation (right) to voxelizations at various resolutions (left). Note how our representation encodes details which are lost in voxel-based representations.
  • Figure 4: IoU vs. Resolution. This plot shows the IoU of a voxelization to the ground truth mesh (solid blue line) in comparison to our continuous representation (solid orange line) as well as the number of parameters per model needed for the two representations (dashed lines). Note how our representation leads to a larger IoU with respect to the ground truth mesh than a low-resolution voxel representation. At the same time, the number of parameters of a voxel representation grows cubically with the resolution, whereas the number of parameters of occupancy networks is independent of the resolution.
  • Figure 5: Single Image 3D Reconstruction. The input image is shown in the first column, the other columns show the results for our method compared to various baselines.
  • ...and 2 more figures
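
The subdivision loop described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: `sphere_occupancy` is a hypothetical stand-in for the thresholded network output, `mise_active_voxels` and its parameters (`init_res`, `n_subdiv`, `lo`, `hi`) are names invented here, and the final marching cubes, simplification, and gradient-based refinement steps are left out. The sketch stops once it has found the finest-level voxels that straddle the surface, which is where a mesh extractor would take over.

```python
import itertools
import math


def sphere_occupancy(p, radius=0.6):
    """Hypothetical stand-in for the thresholded network: True if p is inside."""
    return math.sqrt(sum(c * c for c in p)) < radius


def mise_active_voxels(occ_fn, init_res=4, n_subdiv=3, lo=-1.0, hi=1.0):
    """Adaptively locate the voxels that contain the surface.

    Grid points are addressed by integer indices on the finest grid, so
    every point is evaluated at most once (results are cached).
    Returns (active voxels, voxel size in finest-grid units,
             number of occupancy evaluations, finest resolution).
    """
    final_res = init_res * 2 ** n_subdiv
    cache = {}

    def occupied(idx):
        # Evaluate the occupancy function only for points not seen before.
        if idx not in cache:
            p = tuple(lo + (hi - lo) * c / final_res for c in idx)
            cache[idx] = occ_fn(p)
        return cache[idx]

    offsets = list(itertools.product((0, 1), repeat=3))
    size = 2 ** n_subdiv  # edge length of a coarse voxel, in finest-grid units
    voxels = [(i * size, j * size, k * size)
              for i, j, k in itertools.product(range(init_res), repeat=3)]

    active = []
    for level in range(n_subdiv + 1):
        # A voxel is active if its corners are neither all inside nor all outside.
        active = []
        for x, y, z in voxels:
            corners = [occupied((x + dx * size, y + dy * size, z + dz * size))
                       for dx, dy, dz in offsets]
            if any(corners) and not all(corners):
                active.append((x, y, z))
        if level == n_subdiv:
            break
        # Subdivide each active voxel into 8 subvoxels of half the edge length.
        half = size // 2
        voxels = [(x + dx * half, y + dy * half, z + dz * half)
                  for x, y, z in active for dx, dy, dz in offsets]
        size = half

    return active, size, len(cache), final_res


if __name__ == "__main__":
    active, size, n_evals, final_res = mise_active_voxels(sphere_occupancy)
    dense = (final_res + 1) ** 3
    print(f"{len(active)} surface voxels, {n_evals} evaluations "
          f"(a dense grid would need {dense})")
```

One limitation of any scheme like this is that a feature falling entirely between the points of the initial coarse grid is never marked active, so `init_res` has to be chosen with the expected level of detail in mind.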