AttriHuman-3D: Editable 3D Human Avatar Generation with Attribute Decomposition and Indexing
Fan Yang, Tianyi Chen, Xiaosheng He, Zhongang Cai, Lei Yang, Si Wu, Guosheng Lin
TL;DR
AttriHuman-3D tackles editable 3D-aware human avatar generation by decomposing a 4D space-attribute field into six feature planes and pairing them with an implicit index predictor and orthogonal regularization to disentangle attributes. This enables precise, attribute-level editing in a canonical space with SMPL-based deformation. The method additionally employs a hyper-latent training strategy and attribute-specific sampling to reduce style entanglement, yielding high-quality, view-consistent avatars and effective interactive editing. Experiments on fashion datasets show competitive rendering quality and clear advantages in editing fidelity and efficiency, demonstrating practical applicability for content creation, games, and AR/VR.
Abstract
Editable 3D-aware generation, which supports interactive user editing, has developed rapidly in recent years. However, existing editable 3D GANs either fail to achieve accurate local editing or incur heavy computational costs. We propose AttriHuman-3D, an editable 3D human generation model that addresses these problems with attribute decomposition and indexing. The core idea is to generate all attributes (e.g., human body, hair, and clothes) in an overall attribute space with six feature planes, which are then decomposed and manipulated with different attribute indexes. To precisely extract the features of different attributes from the generated feature planes, we propose a novel attribute indexing method as well as an orthogonal projection regularization that enhances disentanglement. We also introduce a hyper-latent training strategy and an attribute-specific sampling strategy to avoid style entanglement and misleading penalties from the discriminator. Our method allows users to interactively edit selected attributes of the generated 3D human avatars while keeping the others fixed. Both qualitative and quantitative experiments demonstrate that our model provides strong disentanglement between attributes, allows fine-grained image editing, and generates high-quality 3D human avatars.
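To make the six-plane idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that a 4D point (x, y, z, a), with a scalar attribute coordinate a, is projected onto the six axis-pair planes (xy, xz, yz, xa, ya, za), and that the sampled features are summed. It also illustrates one common form of orthogonality penalty on attribute index vectors. The function names, the nearest-neighbour lookup, the summation, and the exact penalty are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

# Assumed axis pairs for a 4D point (x, y, z, a): C(4, 2) = 6 planes.
PLANE_AXES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]


def sample_six_planes(planes, point):
    """Sum nearest-neighbour features from six planes at a point in [0, 1]^4.

    planes: array of shape (6, R, R, C), one R x R feature grid per axis pair.
    point:  array of shape (4,), holding (x, y, z, a).
    Returns an array of shape (C,).
    """
    res = planes.shape[1]
    # Map continuous coordinates to the nearest grid index on each axis.
    idx = np.clip(np.rint(point * (res - 1)).astype(int), 0, res - 1)
    feats = [planes[k, idx[i], idx[j]] for k, (i, j) in enumerate(PLANE_AXES)]
    # Aggregation by summation is an assumption of this sketch.
    return np.sum(feats, axis=0)


def orthogonality_penalty(index_vectors):
    """Sum of squared off-diagonal entries of the Gram matrix of unit-normalised rows.

    The penalty is zero exactly when all index vectors are mutually orthogonal.
    """
    v = index_vectors / np.linalg.norm(index_vectors, axis=1, keepdims=True)
    gram = v @ v.T
    off_diag = gram - np.diag(np.diag(gram))
    return float(np.sum(off_diag ** 2))
```

Under these assumptions, changing only the attribute coordinate a alters the lookups on the three planes that involve a, while the purely spatial planes (xy, xz, yz) return the same features. This is one way to picture how a single set of planes could serve several attributes.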
