PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation

Lei Cheng, Mahdi Saleh, Qing Cheng, Lu Sang, Hongli Xu, Daniel Cremers, Federico Tombari

TL;DR

This work introduces PRISM, a novel compositional approach for 3D shape generation that integrates categorical diffusion models with Statistical Shape Models (SSMs) and Gaussian Mixture Models (GMMs), using the GMM to represent part semantics in a continuous space.

Abstract

Despite the advancements in 3D full-shape generation, accurately modeling complex geometries and semantics of shape parts remains a significant challenge, particularly for shapes with varying numbers of parts. Current methods struggle to effectively integrate the contextual and structural information of 3D shapes into their generative processes. We address these limitations with PRISM, a novel compositional approach for 3D shape generation that integrates categorical diffusion models with Statistical Shape Models (SSM) and Gaussian Mixture Models (GMM). Our method employs compositional SSMs to capture part-level geometric variations and uses GMM to represent part semantics in a continuous space. This integration enables both high fidelity and diversity in generated shapes while preserving structural coherence. Through extensive experiments on shape generation and manipulation tasks, we demonstrate that our approach significantly outperforms previous methods in both quality and controllability of part-level operations. Our code will be made publicly available.
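
To make the idea of representing part semantics in a continuous space more concrete, here is a minimal sketch that is not taken from the paper. It assumes each part category is one equally weighted, diagonal-covariance component of a Gaussian mixture, that a semantic embedding is a sample from that component, and that a category is recovered by picking the most likely component. The category names, means, standard deviations, and the helper names `sample_embedding`, `log_density`, and `decode_label` are all hypothetical.

```python
import math
import random

# Hypothetical part categories, each mapped to one diagonal Gaussian
# component given as (mean, std). These values are illustrative only.
COMPONENTS = {
    "seat": ((0.0, 0.0), (0.3, 0.3)),
    "back": ((3.0, 0.0), (0.3, 0.3)),
    "leg": ((0.0, 3.0), (0.3, 0.3)),
}


def sample_embedding(label, rng):
    """Draw a continuous embedding from the component of a discrete label."""
    mean, std = COMPONENTS[label]
    return tuple(rng.gauss(m, s) for m, s in zip(mean, std))


def log_density(x, mean, std):
    """Log-density of a diagonal Gaussian evaluated at point x."""
    return sum(
        -0.5 * ((xi - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for xi, m, s in zip(x, mean, std)
    )


def decode_label(x):
    """Map an embedding back to the most likely category.

    With equal mixture weights, the highest component density is also
    the highest posterior probability.
    """
    return max(COMPONENTS, key=lambda k: log_density(x, *COMPONENTS[k]))


if __name__ == "__main__":
    rng = random.Random(0)
    emb = sample_embedding("leg", rng)
    print(emb, decode_label(emb))
```

In this toy setup, a generative model can operate on continuous vectors while each vector remains decodable to a discrete part category. How PRISM actually parameterizes and fits its GMM is not specified in this summary.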

Paper Structure

This paper contains 22 sections, 9 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Method overview. Given a segmented 3D shape $S$ of $m$ parts, we encode each part point cloud with a Statistical Shape Model (SSM) and represent the shape as an unordered set of parts. Each part set has its own SSM; the encoder and decoder are the same group of categorical SSMs. We represent each categorical part with a Gaussian embedding and a Gaussian semantics embedding from the Gaussian Mixture Model. We then train our part-level latent diffusion model on the part sets for conditional or unconditional 3D shape generation. The generated latent part set can be decoded and denoised into a clean 3D shape.
  • Figure 2: Qualitative results for unconditional shape generation. Given a ground truth shape, we retrieve the closest generated shape by evaluating the EMD for each method. Our approach yields superior results in producing complex shape structures and high levels of detail after refinement compared to other methods.
  • Figure 3: Part-level single-view reconstruction. We qualitatively compare with Part123 liu2024Part123 on the 3DCoMPaT++ dataset slim_3dcompatplus_2023. Given a single-view image, our method produces a 3D shape that is more consistent with the image, even for complex structures.
  • Figure 4: Qualitative comparison of text-to-shape generation. PRISM generates more diverse, high-quality results that align with the given texts.
  • Figure 5: Cascaded part completion. Our method generates complete shapes from a sub-part input. It derives the complete part from a sub-part using the corresponding SSM, then applies diffusion denoising to produce a variety of plausible whole shapes.
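
The retrieval step mentioned in the Figure 2 caption (finding the generated shape closest to a ground-truth shape under EMD) can be illustrated with a small sketch. It is not the authors' evaluation code. It assumes equal-size point sets with uniform weights, in which case the Earth Mover's Distance reduces to an optimal one-to-one matching, and it finds that matching by brute force, which is only feasible for a handful of points. The averaging over points and the function names `emd` and `retrieve_closest` are assumptions made for illustration.

```python
import itertools
import math


def emd(a, b):
    """Exact EMD between two equal-size, uniformly weighted point sets.

    Tries every one-to-one matching, so the cost grows factorially with
    the number of points; real evaluations use faster or approximate solvers.
    """
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal size")
    best = min(
        sum(math.dist(p, q) for p, q in zip(a, perm))
        for perm in itertools.permutations(b)
    )
    return best / len(a)  # mean matched distance (normalization is an assumption)


def retrieve_closest(reference, candidates):
    """Return the index of the candidate with the smallest EMD to reference."""
    return min(range(len(candidates)), key=lambda i: emd(reference, candidates[i]))


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    candidates = [
        [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)],  # far from the reference
        [(1.0, 0.0, 0.1), (0.0, 0.0, 0.1)],  # near, but points listed in another order
    ]
    print(retrieve_closest(reference, candidates))
```

Because the matching is searched over all permutations, the order in which points are listed does not affect the distance, which is the property that makes EMD suitable for comparing unordered point clouds.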