PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion
Yichen Yang, Hong Li, Haodong Zhu, Linin Yang, Guojun Lei, Sheng Xu, Baochang Zhang
TL;DR
PartDiffuser introduces a semi-autoregressive diffusion framework for 3D mesh generation that decouples global topology from local geometry by performing autoregression between semantic parts and parallel diffusion within each part. The method uses hierarchical geometric conditioning from a point cloud and a Part-Aware Diffusion Block to dynamically guide generation, enabling high-fidelity local details while preserving correct global structure. Empirical results show significant improvements over state-of-the-art baselines, especially on complex datasets like Objaverse, with ablations confirming the value of combining global and part-specific conditioning. The work shows practical potential for producing artist-level 3D meshes suitable for real-world applications, and it describes the dataset construction and provides an efficiency analysis to support future research.
Abstract
Existing autoregressive (AR) methods for generating artist-designed meshes struggle to balance global structural consistency with high-fidelity local details and are susceptible to error accumulation. To address this, we propose PartDiffuser, a novel semi-autoregressive diffusion framework for point-cloud-to-mesh generation. The method first performs semantic segmentation on the mesh and then operates in a "part-wise" manner: it employs autoregression between parts to ensure global topology, while using a parallel discrete diffusion process within each semantic part to precisely reconstruct high-frequency geometric features. PartDiffuser is based on the DiT architecture and introduces a part-aware cross-attention mechanism, using point clouds as hierarchical geometric conditioning to dynamically control the generation process, thereby effectively decoupling the global and local generation tasks. Experiments demonstrate that this method significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail, reaching a level of detail suitable for real-world applications.
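The part-wise control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `MASK`, `toy_denoiser`, `generate_part`, and `generate_mesh` are hypothetical, the denoiser is a random stand-in for the learned DiT model, the unmasking schedule is an assumed confidence-based one, and point-cloud conditioning is omitted entirely. The sketch only shows the outer autoregressive loop over parts and the inner parallel unmasking of a part's tokens.

```python
import random

MASK = -1  # hypothetical id for a masked (not yet generated) token


def toy_denoiser(context, tokens, vocab_size, rng):
    """Stand-in for the learned model (assumption, not the paper's network).

    For every masked slot, return a proposed token and a confidence score.
    A real model would condition on `context` (previous parts) and on the
    point cloud; this toy version ignores both and samples at random.
    """
    proposals = {}
    for i, tok in enumerate(tokens):
        if tok == MASK:
            proposals[i] = (rng.randrange(vocab_size), rng.random())
    return proposals


def generate_part(context, part_len, vocab_size, steps, rng):
    """Inner loop: discrete diffusion over one part, all slots in parallel."""
    tokens = [MASK] * part_len
    for step in range(steps):
        proposals = toy_denoiser(context, tokens, vocab_size, rng)
        if not proposals:
            break  # nothing left to unmask
        # Reveal an even share of the remaining slots per step (ceiling
        # division), so every slot is filled by the final step.
        remaining_steps = steps - step
        k = -(-len(proposals) // remaining_steps)
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for i, (tok, _conf) in ranked[:k]:
            tokens[i] = tok
    return tokens


def generate_mesh(part_lengths, vocab_size=128, steps=4, seed=0):
    """Outer loop: autoregression between parts."""
    rng = random.Random(seed)
    context = []  # tokens of all previously generated parts
    parts = []
    for part_len in part_lengths:
        part = generate_part(context, part_len, vocab_size, steps, rng)
        parts.append(part)
        context.extend(part)  # later parts see earlier parts
    return parts


if __name__ == "__main__":
    for idx, part in enumerate(generate_mesh([5, 3, 7])):
        print(f"part {idx}: {part}")
```

In this sketch, the sequential dependency exists only between parts, while the tokens inside a part are filled over a fixed number of parallel steps. That is the sense in which such a scheme is semi-autoregressive.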
