Learning Implicit Fields for Generative Shape Modeling

Zhiqin Chen, Hao Zhang

TL;DR

The paper introduces IM-NET, an implicit field decoder that represents shapes as continuous inside/outside fields learned from point queries, enabling high-quality, resolution-independent surface extraction via iso-surfaces. By embedding IM-NET into autoencoder and GAN frameworks (IM-AE, IM-GAN), the authors demonstrate improved surface quality, better topology handling, and versatile capabilities across 3D/2D generation, interpolation, and single-view reconstruction. They emphasize the limitations of voxel/CNN-based decoders for visual quality and advocate the light field descriptor (LFD) as a more perceptually aligned evaluation metric. Overall, IM-NET provides a lightweight yet powerful alternative for generative shape modeling with broad applicability and clear qualitative gains, at the cost of slower training and sampling, which warrant further optimization.

Abstract

We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder, called IM-NET, for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. IM-NET is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our implicit decoder for representation learning (via IM-AE) and shape generation (via IM-GAN), we demonstrate superior results for tasks such as generative shape modeling, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality. Code and supplementary material are available at https://github.com/czq142857/implicit-decoder.
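The decoder interface described in the abstract (a point coordinate plus a shape feature vector in, one inside/outside value out) can be sketched as a small multilayer perceptron. This is a minimal illustration only, not the authors' implementation: the class name `ImplicitDecoder`, the hidden layer sizes, the leaky-ReLU slope, and the random untrained weights are all assumptions made for the example.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Affine map: one dot product plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def leaky_relu(x, slope=0.02):
    # The slope value is an assumption chosen for illustration.
    return [v if v > 0 else slope * v for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ImplicitDecoder:
    """Hypothetical stand-in for an implicit field decoder (untrained)."""

    def __init__(self, feature_dim, point_dim=3, hidden=(32, 16), seed=0):
        rng = random.Random(seed)
        # Input is the shape feature concatenated with the query point.
        sizes = [feature_dim + point_dim, *hidden, 1]
        self.layers = [make_layer(a, b, rng)
                       for a, b in zip(sizes, sizes[1:])]

    def __call__(self, feature, point):
        x = list(feature) + list(point)
        for layer in self.layers[:-1]:
            x = leaky_relu(dense(layer, x))
        # A single value in (0, 1), read as an inside/outside score.
        return sigmoid(dense(self.layers[-1], x)[0])


decoder = ImplicitDecoder(feature_dim=8)
shape_code = [0.1] * 8                      # placeholder feature vector
score = decoder(shape_code, (0.0, 0.5, -0.5))
print(f"field value at query point: {score:.3f}")
```

Because the point coordinate is an ordinary input, the same shape code can be queried at any location, which is what makes the representation independent of a fixed output grid.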

Paper Structure

This paper contains 14 sections, 2 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: 3D shapes generated by IM-GAN, our implicit field generative adversarial network, which was trained on $64^3$ or $128^3$ voxelized shapes. The output shapes are sampled at $512^3$ resolution and rendered after Marching Cubes.
  • Figure 2: Network structure of our implicit decoder, IM-NET. The network takes as input a feature vector extracted by a shape encoder, as well as a 3D or 2D point coordinate, and it returns a value indicating the inside/outside status of the point relative to the shape. The encoder can be a CNN or use PointNet, depending on the application.
  • Figure 3: CNN-based decoder vs. our implicit decoder. We trained two autoencoders with CNN decoder ($\text{AE}_{\text{CNN}}$) and our implicit decoder ($\text{AE}_{\text{IM}}$), respectively, on a synthesized dataset of letter A's on white background. The two models have the same CNN encoder. (a) and (b) show the sampled images during AE training. (c) and (d) show interpolation sequences produced by the two trained AEs. See more comparisons in the supplementary material.
  • Figure 4: Visual results for 3D reconstruction. Each column presents one example from one category. IM-AE64 is sampled on $64^3$ resolution and IM-AE256 on $256^3$. All results are rendered using the same Marching Cubes setup.
  • Figure 5: 3D shape interpolation results. 3DGAN, CNN-GAN, and IM-GAN are sampled at $64^3$ resolution to show the smoothness of the surface is not just a matter of sampling resolution. Notice that the morphing sequence of IM-GAN not only consists of smooth part movements (legs, board), but also handles topology changes.
  • ...and 3 more figures
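Several captions above mention sampling the learned field at a chosen resolution ($64^3$, $256^3$, $512^3$) before running Marching Cubes. The sketch below illustrates only the sampling step, under loud assumptions: `sphere_field` is an analytic stand-in for a trained decoder, the grid spans $[-0.5, 0.5]^3$, and the iso-value is taken to be 0.5. It thresholds the samples rather than extracting a mesh, so it is not a Marching Cubes implementation.

```python
import math


def sphere_field(point, radius=0.35):
    """Stand-in implicit field: near 1 inside a sphere, near 0 outside."""
    dist = math.sqrt(sum(c * c for c in point))
    return 1.0 / (1.0 + math.exp(40.0 * (dist - radius)))


def sample_grid(field, resolution):
    """Evaluate the field at voxel centers of a resolution^3 grid
    covering the cube [-0.5, 0.5]^3 (an assumed normalization)."""
    step = 1.0 / resolution
    coords = [-0.5 + (i + 0.5) * step for i in range(resolution)]
    return [[[field((x, y, z)) for z in coords]
             for y in coords]
            for x in coords]


def occupied_fraction(grid, iso=0.5):
    """Fraction of samples whose value exceeds the iso-value."""
    values = [v for plane in grid for row in plane for v in row]
    return sum(v > iso for v in values) / len(values)


# The same field can be sampled at any resolution after the fact.
for resolution in (8, 16, 24):
    grid = sample_grid(sphere_field, resolution)
    print(resolution, round(occupied_fraction(grid), 4))
```

The occupied fraction approaches the sphere's volume as the grid is refined, which mirrors why a field trained on coarse voxel data can still be sampled densely at output time.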