Diverse Part Synthesis for 3D Shape Creation
Yanran Guan, Oliver van Kaick
TL;DR
This work tackles diverse part-based 3D shape synthesis by enabling multiple, distinct part suggestions for incremental assembly. It proposes a two-network architecture: PCN, which encodes parts and predicts their placement, and PSN, which, conditioned on a partial assembly, generates multiple latent codes for new parts. PSN is instantiated with several multimodal models (MDN, cGAN, conditional IMLE (cIMLE), and conditional DDPM (cDDPM)) within a unified implicit-decoder framework. Qualitative and quantitative evaluation on ShapeNet and PML data indicates that cIMLE and cDDPM offer the best trade-offs between diversity and visual fidelity, outperforming the GAN and MDN baselines. The results show improved user-controlled shape refinement with high reconstruction fidelity, highlighting the approach's potential for interactive 3D content creation and shape exploration. The paper also discusses limitations in part placement accuracy and data diversity, suggesting future work on richer part hierarchies and broader datasets.
Abstract
Methods that use neural networks for synthesizing 3D shapes in the form of a part-based representation have been introduced over the last few years. These methods represent shapes as a graph or hierarchy of parts and enable a variety of applications such as shape sampling and reconstruction. However, current methods do not allow easily regenerating individual shape parts according to user preferences. In this paper, we investigate techniques that allow the user to generate multiple, diverse suggestions for individual parts. Specifically, we experiment with multimodal deep generative models that allow sampling diverse suggestions for shape parts and focus on models which have not been considered in previous work on shape synthesis. To provide a comparative study of these techniques, we introduce a method for synthesizing 3D shapes in a part-based representation and evaluate all the part suggestion techniques within this synthesis method. In our method, which is inspired by previous work, shapes are represented as a set of parts in the form of implicit functions which are then positioned in space to form the final shape. Synthesis in this representation is enabled by a neural network architecture based on an implicit decoder and a spatial transformer. We compare the various multimodal generative models by evaluating their performance in generating part suggestions. Our contribution is to show with qualitative and quantitative evaluations which of the new techniques for multimodal part generation perform the best and that a synthesis method based on the top-performing techniques allows the user to more finely control the parts that are generated in the 3D shapes while maintaining high shape fidelity when reconstructing shapes.
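To make the representation described in the abstract more concrete, the following is a minimal sketch of a shape formed as a union of parts, where each part is an implicit (occupancy) function in a canonical frame that is then positioned in space. It is an illustration under stated assumptions, not the paper's implementation: the analytic cube in `box_occupancy` stands in for the learned implicit decoder, the fixed scale and translation in `place_part` stand in for the placement predicted by the spatial transformer, and all function names and the two-part "table" are hypothetical.

```python
import numpy as np

def box_occupancy(points, half_extent=0.5):
    """Stand-in for a learned implicit decoder (assumption):
    returns 1.0 for points inside a cube centered at the origin, else 0.0."""
    return np.all(np.abs(points) <= half_extent, axis=-1).astype(float)

def place_part(points, scale, translation):
    """Map world-space query points into a part's canonical frame by
    inverting an axis-aligned scale followed by a translation."""
    return (points - np.asarray(translation)) / np.asarray(scale)

def shape_occupancy(points, parts):
    """Occupancy of the assembled shape: the union of the placed parts,
    computed as the maximum over the per-part occupancies."""
    per_part = [box_occupancy(place_part(points, s, t)) for s, t in parts]
    return np.max(np.stack(per_part, axis=0), axis=0)

# Hypothetical two-part assembly: a flat top and a single leg,
# each given as (scale, translation).
parts = [
    ((1.0, 0.1, 1.0), (0.0, 0.5, 0.0)),  # top slab
    ((0.1, 0.5, 0.1), (0.4, 0.2, 0.4)),  # leg
]

# Query points: center of the top, center of the leg, and empty space.
queries = np.array([
    [0.0, 0.5, 0.0],
    [0.4, 0.2, 0.4],
    [0.0, 0.0, 0.0],
])
result = shape_occupancy(queries, parts)
print(result)  # [1. 1. 0.]
```

In this sketch, regenerating a single part amounts to swapping one entry of `parts` while leaving the others untouched, which is the kind of per-part control the abstract describes. In the paper's setting, the swapped part would instead be decoded from a newly sampled latent code.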
