Occupancy Networks: Learning 3D Reconstruction in Function Space
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, Andreas Geiger
TL;DR
This work introduces Occupancy Networks, a continuous representation of 3D geometry that models the occupancy function with a neural network conditioned on input observations. The surface is the network's decision boundary, so it can be evaluated at arbitrary resolution while the memory footprint stays fixed. To recover meshes efficiently, the authors propose Multiresolution IsoSurface Extraction (MISE), an adaptive procedure that refines the evaluation grid only near the surface. They demonstrate competitive or superior performance on single-image reconstruction, point-cloud completion, voxel super-resolution, and unconditional mesh generation on ShapeNet, with notable memory efficiency and the ability to represent arbitrary topology. Ablations on sampling strategy and architecture show that uniform sampling and a ResNet-based decoder with conditional batch normalization yield the best results. Overall, occupancy networks offer a flexible, scalable framework for learning-based 3D reconstruction that preserves detail while avoiding discretization bottlenecks.
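The adaptive refinement idea behind MISE can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`occupancy`, `is_active`, `refine`) are hypothetical, a smooth analytic sphere stands in for the trained network, corner values are re-evaluated rather than cached, and the final Marching Cubes and mesh refinement steps are omitted. It only shows how a continuous occupancy function is thresholded at a value `tau` and how voxels whose corners disagree are subdivided.

```python
import numpy as np

# Offsets of the 8 corners of a unit voxel.
CORNERS = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)],
                   dtype=float)

def occupancy(points, radius=0.6):
    """Hypothetical stand-in for a trained network f_theta(p, x).

    Maps an (N, 3) array of points to occupancy probabilities in [0, 1];
    here a sigmoid of the signed distance to a sphere (an assumption).
    """
    dist = np.linalg.norm(points, axis=1)
    return 1.0 / (1.0 + np.exp(-(radius - dist) * 40.0))

def is_active(origins, size, f, tau):
    """Mark voxels whose 8 corners are neither all inside nor all outside."""
    pts = origins[:, None, :] + size * CORNERS[None, :, :]   # (V, 8, 3)
    inside = f(pts.reshape(-1, 3)).reshape(-1, 8) >= tau
    return inside.any(axis=1) & ~inside.all(axis=1)

def refine(f, lo=-1.0, hi=1.0, base=4, levels=4, tau=0.5):
    """Subdivide only active voxels, starting from a base^3 grid.

    Returns the origins of the active voxels at the finest level, their
    edge length, and the number of occupancy queries issued.
    """
    size = (hi - lo) / base
    idx = np.stack(np.meshgrid(*[np.arange(base)] * 3, indexing="ij"),
                   axis=-1).reshape(-1, 3)
    origins = lo + size * idx
    n_queries = 8 * len(origins)
    origins = origins[is_active(origins, size, f, tau)]
    for _ in range(levels):
        size /= 2.0
        # Split every active voxel into 8 children.
        origins = (origins[:, None, :] + size * CORNERS[None, :, :]).reshape(-1, 3)
        n_queries += 8 * len(origins)
        origins = origins[is_active(origins, size, f, tau)]
    return origins, size, n_queries

voxels, edge, n_queries = refine(occupancy)
print(len(voxels), "active voxels of edge", edge, "after", n_queries, "queries")
```

Because only voxels near the decision boundary are subdivided, the number of queries grows roughly with the surface area rather than with the volume of the grid, which is the motivation for an adaptive scheme over dense evaluation.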
Abstract
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
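The "decision boundary" idea in the abstract can be written compactly. The following restates the formulation as we understand it from the paper; the symbols ($f_\theta$ for the network, $x$ for the observation, $\tau$ for the threshold, $K$ for the number of sampled points per object, $\mathcal{B}$ for a mini-batch) should be read as a notational sketch rather than a verbatim quotation.

```latex
% Ground-truth occupancy of a point p
o : \mathbb{R}^3 \to \{0, 1\}

% Occupancy network: point p and observation x map to an occupancy probability
f_\theta : \mathbb{R}^3 \times \mathcal{X} \to [0, 1]

% The reconstructed surface is the level set at threshold \tau
\mathcal{S}(x) = \{\, p \in \mathbb{R}^3 \mid f_\theta(p, x) = \tau \,\}

% Mini-batch training loss: cross-entropy \mathcal{L} over K sampled points per object
\mathcal{L}_{\mathcal{B}}(\theta) = \frac{1}{|\mathcal{B}|}
  \sum_{i=1}^{|\mathcal{B}|} \sum_{j=1}^{K}
  \mathcal{L}\bigl(f_\theta(p_{ij}, x_i),\, o_{ij}\bigr),
  \qquad o_{ij} = o(p_{ij})
```

Since $f_\theta$ is defined for every $p \in \mathbb{R}^3$, the level set can be sampled at any resolution without storing a grid.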
