OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
Yunhan Yang, Yufan Zhou, Yuan-Chen Guo, Zi-Xin Zou, Yukun Huang, Ying-Tian Liu, Hao Xu, Ding Liang, Yan-Pei Cao, Xihui Liu
TL;DR
OmniPart creates editable, part-based 3D assets by decoupling structure planning from dense part synthesis. It introduces an autoregressive bounding-box planner guided by flexible 2D part masks, and a spatially conditioned rectified-flow latent generator that jointly synthesizes all parts within the planned layout, leveraging a TRELLIS-based structured latent space. The approach achieves state-of-the-art part-aware 3D generation with strong part-level control, coherence, and texturing, enabling downstream tasks such as animation and material editing. The resulting assets are more interpretable and editable while maintaining high fidelity and efficiency.
Abstract
The creation of 3D assets with explicit, editable part structures is crucial for advancing interactive applications, yet most generative methods produce only monolithic shapes, limiting their utility. We introduce OmniPart, a novel framework for part-aware 3D object generation designed to achieve high semantic decoupling among components while maintaining robust structural cohesion. OmniPart splits this complex task into two synergistic stages: (1) an autoregressive structure planning module generates a controllable, variable-length sequence of 3D part bounding boxes, guided by flexible 2D part masks that allow intuitive control over part decomposition without requiring direct correspondences or semantic labels; and (2) a spatially conditioned rectified flow model, efficiently adapted from a pre-trained holistic 3D generator, synthesizes all 3D parts simultaneously and consistently within the planned layout. Our approach supports user-defined part granularity and precise localization, and enables diverse downstream applications. Extensive experiments demonstrate that OmniPart achieves state-of-the-art performance, paving the way for more interpretable, editable, and versatile 3D content.
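
The abstract does not describe how the rectified flow model in stage (2) is sampled. As a rough illustration of the general technique only, the sketch below integrates a velocity field from noise at t = 0 to a sample at t = 1 with fixed-step Euler updates. Everything here is an assumption made for illustration: the names `euler_sample` and `make_toy_velocity` are hypothetical, the velocity field is a closed-form stand-in for a trained network, and no spatial conditioning on bounding boxes is modeled. It is not OmniPart's implementation.

```python
import random


def euler_sample(velocity, x0, num_steps=10):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a trained velocity network.

    For the straight path x_t = (1 - t) * x0 + t * target, the velocity
    at a point x on the path is (target - x) / (1 - t).
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(3)]  # starting point at t=0
target = [0.5, -1.0, 2.0]                            # toy "data" point at t=1
sample = euler_sample(make_toy_velocity(target), noise, num_steps=10)
```

Because the toy path is a straight line, its velocity is constant along the trajectory, so the Euler steps land on `target` up to floating-point error regardless of `num_steps`. A learned velocity field is only approximately straight, which is why the number of steps trades speed against accuracy in practice.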
