DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters

Mingze Sun, Junhao Chen, Junting Dong, Yurun Chen, Xinyu Jiang, Shiwei Mao, Puhua Jiang, Jingbo Wang, Bo Dai, Ruqi Huang

TL;DR

This work proposes DRiVE, a novel framework for generating and rigging 3D human characters with intricate structures that utilizes a 3D Gaussian representation, facilitating efficient animation and high-quality rendering.

Abstract

Recent advances in generative models have enabled high-quality 3D character reconstruction from multi-modal inputs. However, animating these generated characters remains challenging, especially for complex elements like garments and hair, due to the lack of large-scale datasets and effective rigging methods. To address this gap, we curate AnimeRig, a large-scale dataset with detailed skeleton and skinning annotations. Building upon this, we propose DRiVE, a novel framework for generating and rigging 3D human characters with intricate structures. Unlike existing methods, DRiVE utilizes a 3D Gaussian representation, facilitating efficient animation and high-quality rendering. We further introduce GSDiff, a 3D Gaussian-based diffusion module that predicts joint positions as spatial distributions, overcoming the limitations of regression-based approaches. Extensive experiments demonstrate that DRiVE achieves precise rigging results, enabling realistic dynamics for clothing and hair, and surpassing previous methods in both quality and versatility. The code and dataset will be made public for academic use upon acceptance.

Paper Structure

This paper contains 17 sections, 5 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: We compare the animation results of our method with those of CharacterGen [peng2024charactergen]. Since DRiVE explicitly models clothing and hair, it generates more natural and realistic animations. Additionally, 3D Gaussian-based rendering achieves higher quality than mesh-based rendering.
  • Figure 2: (a) Comparison of LGM results before and after fine-tuning on our dataset. (b) Ground-truth skeleton and skinning results on the mesh, and the same results transferred to 3D Gaussians.
  • Figure 3: The overall pipeline of our framework. See the main text for more details.
  • Figure 4: Pipeline for 3D Gaussian refinement and results using a T-pose anime image.
  • Figure 5: Our method accurately predicts skeletal structures, outperforming RigNet [xu2020rignet] in joint and bone estimation.
  • ...and 2 more figures