Diverse Part Synthesis for 3D Shape Creation

Yanran Guan, Oliver van Kaick

TL;DR

This work tackles diverse part-based 3D shape synthesis by enabling multiple, distinct part suggestions for incremental assembly. It proposes a two-network architecture: PCN, which encodes parts and predicts their placement, and PSN, which, conditioned on a partial assembly, generates multiple latent codes for new parts using multimodal models such as MDN, cGAN, cIMLE, and cDDPM within a unified implicit-decoder framework. Through extensive qualitative and quantitative evaluation on ShapeNet and PML data, the study demonstrates that conditional IMLE (cIMLE) and conditional DDPM (cDDPM) offer the best trade-offs for diversity and visual fidelity, outperforming GAN and MDN baselines. The results show improved user-controlled shape refinement with high reconstruction fidelity, highlighting the approach's potential for interactive 3D content creation and shape exploration. The paper also discusses limitations in part placement accuracy and data diversity, suggesting future work on richer part hierarchies and broader datasets.

Abstract

Methods that use neural networks for synthesizing 3D shapes in the form of a part-based representation have been introduced over the last few years. These methods represent shapes as a graph or hierarchy of parts and enable a variety of applications such as shape sampling and reconstruction. However, current methods do not allow easily regenerating individual shape parts according to user preferences. In this paper, we investigate techniques that allow the user to generate multiple, diverse suggestions for individual parts. Specifically, we experiment with multimodal deep generative models that allow sampling diverse suggestions for shape parts and focus on models which have not been considered in previous work on shape synthesis. To provide a comparative study of these techniques, we introduce a method for synthesizing 3D shapes in a part-based representation and evaluate all the part suggestion techniques within this synthesis method. In our method, which is inspired by previous work, shapes are represented as a set of parts in the form of implicit functions which are then positioned in space to form the final shape. Synthesis in this representation is enabled by a neural network architecture based on an implicit decoder and a spatial transformer. We compare the various multimodal generative models by evaluating their performance in generating part suggestions. Our contribution is to show with qualitative and quantitative evaluations which of the new techniques for multimodal part generation perform the best and that a synthesis method based on the top-performing techniques allows the user to more finely control the parts that are generated in the 3D shapes while maintaining high shape fidelity when reconstructing shapes.
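
The representation described in the abstract (parts as implicit functions that are positioned in space to form the shape) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the analytic box below stands in for a learned implicit decoder, the helper names (`unit_box`, `place`, `compose`) are hypothetical, and taking the union of parts as the maximum of their occupancies is an assumption made for illustration.

```python
import numpy as np

def unit_box(p):
    """Occupancy (1 inside, 0 outside) of an axis-aligned box with side 1,
    centered at the origin of the normalized part space.
    Stand-in for a learned implicit decoder (assumption)."""
    return np.all(np.abs(p) <= 0.5, axis=-1).astype(float)

def place(part_fn, A, t):
    """Position a normalized part in world space with the affine map
    x_world = A @ x_local + t. A world-space query point is mapped back
    into the part's normalized space before the part is evaluated."""
    A_inv = np.linalg.inv(A)
    def world_fn(p):
        local = (p - t) @ A_inv.T  # row-vector form of A_inv @ (p - t)
        return part_fn(local)
    return world_fn

def compose(parts):
    """Shape occupancy as the union of part occupancies (max is an assumption)."""
    def shape_fn(p):
        return np.max([f(p) for f in parts], axis=0)
    return shape_fn

# Toy "chair": a flat seat and one thin leg, both derived from the same unit box.
seat = place(unit_box, np.diag([1.0, 0.1, 1.0]), np.array([0.0, 0.5, 0.0]))
leg = place(unit_box, np.diag([0.1, 0.5, 0.1]), np.array([0.4, 0.25, 0.4]))
shape = compose([seat, leg])

queries = np.array([
    [0.0, 0.5, 0.0],    # center of the seat
    [0.4, 0.25, 0.4],   # center of the leg
    [0.0, 0.0, 0.0],    # empty space under the seat
])
occupancy = shape(queries)
```

In this sketch each part keeps its own normalized geometry, so replacing one part only requires swapping its occupancy function and affine parameters while the rest of the shape is untouched.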

Paper Structure

This paper contains 34 sections, 15 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: Our method for incremental shape synthesis based on diverse part suggestions. Starting from an initial part (in gray), our method iteratively synthesizes new part suggestions and connects them to the existing part(s) to compose a coherent shape. Various, distinct part suggestions are proposed at each iteration (each column), and the user selects one part to proceed to the next iteration (top row). Note that the geometry of the parts is also synthesized by the method.
  • Figure 2: Our method for 3D shape synthesis with part suggestions comprises two main deep neural networks. During the training phase, the PCN receives a set of normalized and transformed shape parts. The PCN first encodes the normalized shape parts into a latent vector $\mathbf{z}$ and then learns the affine transformation parameters that transform the normalized parts to their target positions in order to compose a coherent shape. The PSN receives a partial assembly and learns to generate the latent vector $\hat{\mathbf{z}}$ that represents the shape part complementing the partial assembly. After the networks are trained, given a partial assembly, we use the PSN to generate a set of latent samples and pass them to an implicit decoder to produce the suggested parts. Then, we use the PCN to predict the affine transformations to connect the suggested parts to the partial assembly.
  • Figure 3: Illustration of how IMLE prevents mode collapse: (a) The common training scheme for generative models maps each generated sample $\hat{\mathbf{x}}_j$ to the closest real sample $\mathbf{x}_i$, which can lead to some modes ($\mathbf{x}_i$) not being represented by any generated sample; (b) IMLE prevents mode collapse by ensuring a mapping from each real sample $\mathbf{x}_i$ to at least one generated sample $\hat{\mathbf{x}}_j$.
  • Figure 4: Comparison of the parts suggested by MDN, cGAN, cIMLE, and cDDPM, given the initial parts in gray on the left.
  • Figure 5: Examples of shapes synthesized with cIMLE through part suggestion. Starting from an initial part (in gray in the top row), our system incrementally suggests possible parts to complement the current partial shape, until a predefined number of iterations (e.g., $4$) is reached. At each iteration, the user can select one of the suggested parts and connect it to the existing part(s). In this manner, various shapes can be synthesized. We can see that the synthesized shapes differ from their nearest neighbors in the training set (in gray in the bottom row).
  • ...and 5 more figures
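
The contrast drawn in the Figure 3 caption can be illustrated with a small NumPy sketch. It is an illustration under stated assumptions rather than the paper's training code: the function names and the toy 2D data are hypothetical, squared Euclidean distance is assumed as the matching metric, and the conditioning on a partial assembly used by cIMLE is omitted.

```python
import numpy as np

def pairwise_sq_dist(real, generated):
    """Squared Euclidean distances, shape (n_real, n_generated)."""
    diff = real[:, None, :] - generated[None, :, :]
    return (diff ** 2).sum(axis=-1)

def nearest_real_per_generated(real, generated):
    """Direction (a): each generated sample is matched to its closest real sample.
    Real samples that no generated sample picks are uncovered modes."""
    return pairwise_sq_dist(real, generated).argmin(axis=0)

def imle_loss(real, generated):
    """Direction (b), IMLE-style: each real sample is matched to its closest
    generated sample, so every real sample contributes to the loss."""
    d2 = pairwise_sq_dist(real, generated)
    nearest = d2.argmin(axis=1)
    loss = d2[np.arange(len(real)), nearest].mean()
    return loss, nearest

# Two real modes; the generated samples have collapsed onto the first one.
real = np.array([[0.0, 0.0], [10.0, 10.0]])
generated = np.array([[0.1, 0.0], [0.2, 0.1], [-0.1, 0.0]])

covered = set(nearest_real_per_generated(real, generated).tolist())
loss, nearest = imle_loss(real, generated)
```

Here `covered` contains only the first real sample, so matching in direction (a) never signals that the second mode is missing. The IMLE-style loss is dominated by the distance from the second real sample to its nearest generated sample, which is the term that pushes a generator to cover that mode.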