UniPart: Part-Level 3D Generation with Unified 3D Geom-Seg Latents

Xufan He, Yushuang Wu, Xiaoyang Guo, Chongjie Ye, Jiaqing Zhou, Tianlei Hu, Xiaoguang Han, Dong Du

TL;DR

This work tackles part-level 3D generation by learning part-aware structure directly from whole-object geometry. It introduces Geom-Seg VecSet, a unified geometry-segmentation latent space, and UniPart, a two-stage latent diffusion framework that first generates a global Geom-Seg VecSet latent together with part-latent masks, then performs dual-space part diffusion in both global and canonical coordinates. The authors report better segmentation controllability and higher-fidelity part geometry than state-of-the-art baselines, both on quantitative metrics (Chamfer Distance (CD), F-score, and mean IoU (mIoU)) and in qualitative evaluations. Because part decomposition happens end to end in latent space, without an external segmenter, UniPart targets editable, decomposable 3D content creation, with practical implications for robotics, design, and visualization.
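For readers unfamiliar with the three metrics named above, the following is a minimal NumPy sketch of how they are commonly computed on sampled point sets. It is not the paper's evaluation code: the function names, the symmetric unsquared form of Chamfer Distance, the threshold `tau`, and the assumption that predicted and ground-truth part ids already correspond are all illustrative choices, and published definitions vary (squared vs. unsquared distances, sum vs. mean, different thresholds).

```python
import numpy as np


def pairwise_sq_dists(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Squared Euclidean distances between every row of a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Chamfer Distance: mean nearest-neighbour distance in both directions.

    Assumption: unsquared distances, summed over the two directions.
    """
    d2 = pairwise_sq_dists(pred, gt)
    pred_to_gt = np.sqrt(d2.min(axis=1)).mean()
    gt_to_pred = np.sqrt(d2.min(axis=0)).mean()
    return float(pred_to_gt + gt_to_pred)


def f_score(pred: np.ndarray, gt: np.ndarray, tau: float = 0.1) -> float:
    """Harmonic mean of precision and recall at distance threshold tau (illustrative value)."""
    d = np.sqrt(pairwise_sq_dists(pred, gt))
    precision = float((d.min(axis=1) < tau).mean())  # predicted points near the ground truth
    recall = float((d.min(axis=0) < tau).mean())     # ground-truth points near the prediction
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def mean_iou(pred_labels: np.ndarray, gt_labels: np.ndarray) -> float:
    """Mean IoU over ground-truth parts.

    Assumption: part ids in pred_labels and gt_labels already refer to the same parts.
    """
    ious = []
    for part in np.unique(gt_labels):
        p = pred_labels == part
        g = gt_labels == part
        ious.append(np.logical_and(p, g).sum() / np.logical_or(p, g).sum())
    return float(np.mean(ious))
```

Lower CD is better, while higher F-score and mIoU are better. When generated parts carry no fixed identity, an extra matching step between predicted and ground-truth parts is typically needed before computing mIoU; that step is omitted here.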

Abstract

Part-level 3D generation is essential for applications requiring decomposable and structured 3D synthesis. However, existing methods either rely on implicit part segmentation with limited granularity control or depend on strong external segmenters trained on large annotated datasets. In this work, we observe that part awareness emerges naturally during whole-object geometry learning and propose Geom-Seg VecSet, a unified geometry-segmentation latent representation that jointly encodes object geometry and part-level structure. Building on this representation, we introduce UniPart, a two-stage latent diffusion framework for image-guided part-level 3D generation. The first stage performs joint geometry generation and latent part segmentation, while the second stage conditions part-level diffusion on both whole-object and part-specific latents. A dual-space generation scheme further enhances geometric fidelity by predicting part latents in both global and canonical spaces. Extensive experiments demonstrate that UniPart achieves superior segmentation controllability and part-level geometric quality compared with existing approaches.
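The distinction between global and canonical spaces can be illustrated with a small coordinate sketch. The abstract does not define the canonical space, so the code below assumes one common convention: each part is recentred on its bounding-box centre and uniformly scaled so its longest side spans [-1, 1]. The function names and the `half_extent` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def to_canonical(points: np.ndarray, half_extent: float = 1.0):
    """Map part points from the object's (global) frame to a per-part canonical frame.

    Assumption: canonical means centred on the bounding-box centre and uniformly
    scaled so the longest bounding-box side spans [-half_extent, half_extent].
    Returns the canonical points plus the centre and scale needed to invert the map.
    """
    lo = points.min(axis=0)
    hi = points.max(axis=0)
    center = (lo + hi) / 2.0
    # Guard against degenerate (single-point) parts with a tiny floor on the extent.
    scale = max(float((hi - lo).max()) / 2.0, 1e-8) / half_extent
    return (points - center) / scale, center, scale


def to_global(canonical_points: np.ndarray, center: np.ndarray, scale: float) -> np.ndarray:
    """Inverse of to_canonical: place a canonical-space part back into the object frame."""
    return canonical_points * scale + center


# A small part that occupies only a sliver of the object's bounding volume.
part = np.array([[0.20, 0.10, 0.00],
                 [0.30, 0.10, 0.00],
                 [0.25, 0.15, 0.05]])
canonical, center, scale = to_canonical(part)
restored = to_global(canonical, center, scale)
```

Under this assumption, the global frame preserves where a part sits relative to the whole object, while the canonical frame gives a small part the full coordinate range, which is one plausible reason a model would predict in both.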

Paper Structure

This paper contains 29 sections, 4 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: The pipeline of UniPart. It includes a Geom-Seg VAE that encodes both whole geometry and part segmentation information into a unified representation, Geom-Seg VecSet. The image-guided part-level generation adopts a two-level pipeline, where a whole-level DiT first generates the whole geometry and segmented part latent, and a part-level DiT then accepts the input image and the whole-part latent as conditions for dual-space part latent generation. The final object mesh is composed of each full-resolution part mesh.
  • Figure 2: Reconstruction results of our Geom-Seg VAE. (a) Input mesh with part segmentation masks; (b) Reconstructed geometry; (c) Reconstructed part latent segmentation visualization; (d) Reconstructed geometry of Hunyuan3D-2.1 for reference. Zoom in for details.
  • Figure 3: Generated results of our whole-level DiT. (a) Input image; (b) Generated whole-object geometry; (c) Generated part latent segmentation. Zoom in for details.
  • Figure 4: Qualitative results of part-level 3D generation for a given image. The first row of each mesh pair shows the parts "exploded" so that individual parts are easier to see. UniPart produces more reasonable part segmentations and higher-quality part geometries.
  • Figure 5: More generation results from UniPart. (a) Input images; (b) The whole-object geometry output by our whole-level DiT; (c) The composed part meshes output by our part-level DiT; (d) The part latent segmentation visualization produced by our whole-level DiT.
  • ...and 2 more figures
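
The hand-off between the two levels described in the Figure 1 caption, where the whole-level DiT outputs a whole-object latent plus a part segmentation over that latent, can be sketched as a grouping of latent tokens by part id. This is only an illustration under stated assumptions: the latent is taken to be an array of N tokens with C channels, the segmentation is taken to be one integer part id per token, and `split_latents_by_part` is a hypothetical helper rather than anything from the paper.

```python
import numpy as np


def split_latents_by_part(latents: np.ndarray, part_ids: np.ndarray) -> dict:
    """Group latent tokens by their predicted part id.

    Assumptions (not from the paper): latents has shape (N, C) and part_ids has
    shape (N,), holding one integer part id per latent token.
    Returns a dict mapping each part id to its (n_part, C) token subset.
    """
    if latents.shape[0] != part_ids.shape[0]:
        raise ValueError("latents and part_ids must have the same number of tokens")
    return {int(p): latents[part_ids == p] for p in np.unique(part_ids)}


# Toy example: 6 latent tokens with 2 channels, assigned to 3 parts.
latents = np.arange(12.0).reshape(6, 2)
part_ids = np.array([0, 1, 0, 2, 1, 0])
parts = split_latents_by_part(latents, part_ids)
```

Each per-part token subset could then serve, together with the whole-object latent and the input image, as conditioning for a part-level generator; how UniPart actually consumes these inputs is specified in the paper, not here.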