HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset
Zedong Chu, Feng Xiong, Meiduo Liu, Jinzhi Zhang, Mingqi Shao, Zhaoxu Sun, Di Wang, Mu Xu
TL;DR
HumanRig addresses the lack of large-scale, standardized data for humanoid rigging by introducing a dataset of 11,434 AI-generated T-pose models aligned to a uniform Mixamo skeleton, together with a data-driven automatic rigging pipeline. The framework combines a Prior-Guided Skeleton Estimator (PGSE) that initializes coarse 3D joints, a U-shaped Point Transformer that encodes mesh features, and a Mesh-Skeleton Mutual Attention Network (MSMAN) that jointly optimizes skeleton construction and skinning. Key contributions include the PGSE, the MSMAN-based fusion of skeleton and mesh features, and performance gains over state-of-the-art methods on both AI-generated and artist-created meshes. The work points toward more efficient, automated rigging workflows in animation pipelines, while noting limitations in fine finger anatomy and a potential extension to quadrupeds.
Abstract
With the rapid evolution of 3D generation algorithms, the cost of producing 3D humanoid character models has plummeted, yet the field is impeded by the lack of a comprehensive dataset for automatic rigging, which is a pivotal step in character animation. Addressing this gap, we present HumanRig, the first large-scale dataset specifically designed for 3D humanoid character rigging, encompassing 11,434 meticulously curated T-posed meshes that adhere to a uniform skeleton topology. Capitalizing on this dataset, we introduce an innovative, data-driven automatic rigging framework, which overcomes the limitations of GNN-based methods in handling complex AI-generated meshes. Our approach integrates a Prior-Guided Skeleton Estimator (PGSE) module, which uses 2D skeleton joints to provide a preliminary 3D skeleton, and a Mesh-Skeleton Mutual Attention Network (MSMAN) that fuses skeleton features with 3D mesh features extracted by a U-shaped point transformer. This design enables coarse-to-fine 3D skeleton joint regression and robust skinning estimation, surpassing previous methods in quality and versatility. This work not only remedies the dataset deficiency in rigging research but also propels the animation industry towards more efficient and automated character rigging pipelines.
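The fusion of skeleton and mesh features described in the abstract can be pictured with a minimal NumPy sketch of bidirectional (mutual) cross-attention. This is an illustration under stated assumptions, not the authors' MSMAN: the function names, the feature sizes, the absence of learned projections, and the residual update are all hypothetical simplifications, and random arrays stand in for the point-transformer mesh features and the PGSE joint features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # Scaled dot-product attention with no learned projections
    # (an illustrative simplification, not the paper's layer).
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values, weights

def mutual_attention(mesh_feats, joint_feats):
    # Mesh vertices attend to joints, and joints attend to vertices.
    mesh_update, mesh_to_joint = cross_attention(mesh_feats, joint_feats)
    joint_update, _ = cross_attention(joint_feats, mesh_feats)
    # Residual updates keep the original features in each stream (assumed).
    return mesh_feats + mesh_update, joint_feats + joint_update, mesh_to_joint

# Hypothetical sizes: 1024 sampled vertices, 22 joints, 64-d features.
rng = np.random.default_rng(0)
mesh_feats = rng.normal(size=(1024, 64))   # stand-in for mesh features
joint_feats = rng.normal(size=(22, 64))    # stand-in for coarse joint features

fused_mesh, fused_joints, mesh_to_joint = mutual_attention(mesh_feats, joint_feats)
print(fused_mesh.shape, fused_joints.shape, mesh_to_joint.shape)
```

In this sketch each row of `mesh_to_joint` is a non-negative distribution over joints that sums to one, which is the same constraint skinning weights satisfy. That is only an analogy here; how the paper actually regresses joints and skinning weights from the fused features is not specified in this abstract.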
