AniGaussian: Animatable Gaussian Avatar with Pose-guided Deformation

Mengtian Li, Shengxiang Yao, Chen Kai, Zhifeng Xie, Keyu Chen, Yu-Gang Jiang

TL;DR

AniGaussian addresses the challenge of reconstructing highly detailed, animatable human avatars from monocular video while maintaining pose-consistent geometry. It introduces a pose-guided deformation framework that couples non-rigid cloth movement with rigid body pose, using SMPL priors to guide local deformations and a rigid-based regularization to stabilize canonical Gaussians. A split-with-scale strategy enhances geometry expressiveness, and joint optimization of SMPL parameters improves alignment with observed data. On PeopleSnapshot and ZJU-MoCap, AniGaussian achieves superior novel-view and novel-pose results. It trains in around 30 minutes and renders at 45 FPS on a single NVIDIA RTX 4090 GPU, which shows practical potential for high-fidelity virtual avatars in visualization and VR/AR applications.

Abstract

Recent advancements in Gaussian-based human body reconstruction have achieved notable success in creating animatable avatars. However, challenges remain in fully exploiting the SMPL model's prior knowledge and in enhancing the visual fidelity of these models to achieve more refined avatar reconstructions. In this paper, we introduce AniGaussian, which addresses these issues with two insights. First, we propose a pose-guided deformation strategy that constrains the dynamic Gaussian avatar with SMPL pose guidance, ensuring that the reconstructed model not only captures detailed surface nuances but also maintains anatomical correctness across a wide range of motions. Second, we tackle the expressiveness limitations of Gaussian models in representing dynamic human bodies. We incorporate rigid-based priors from previous works to enhance the dynamic transform capabilities of the Gaussian model. Furthermore, we introduce a split-with-scale strategy that significantly improves geometry quality. Ablation studies demonstrate the effectiveness of our model design. In extensive comparisons with existing methods, AniGaussian demonstrates superior performance in both qualitative results and quantitative metrics.

Paper Structure

This paper contains 13 sections, 9 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: AniGaussian takes a monocular RGB video as input, reconstructs an animatable avatar model in around 30 minutes, and renders at 45 FPS on a single NVIDIA RTX 4090 GPU. The resulting human model presents subtle texture and generates non-rigid deformation of clothing details. The figure shows results in novel views and animation with unseen poses. Furthermore, we achieve the highest reconstruction quality among current works, as is evident in our image metrics.
  • Figure 2: Overview of AniGaussian. First, we initialize the point cloud using SMPL vertices. During training, we find the nearest vertex to each Gaussian and use it as the Gaussian's deformation guide. We feed the positionally encoded Gaussian position, together with the nearest vertex as the deformation code, to the MLP to obtain the non-rigid deformation. The Gaussians are then transformed to the pose space using the transformation of the SMPL vertex. During this transformation, the rigid-based priors $L_{rot}$ and $L_{iso}$ regularize the deformation. After Gaussian splatting, we refine the SMPL parameters and the canonical model.
  • Figure 3: Visualization of the rigid-based prior. Under the deformation between the canonical space and the observation space, neighbouring Gaussians should have similar rotations and keep appropriate distances from one another.
  • Figure 4: Qualitative comparison of novel view synthesis on the PeopleSnapshot dataset. Compared to other methods, our method effectively restores details on the animatable avatar, including intricate details in the hair and folds in the clothes. These results underscore its applicability and robustness in real-world scenarios.
  • Figure 5: Novel pose synthesis on PeopleSnapshot (Alldieck et al., 2018). Our method drives the reconstructed animatable avatar in novel poses with fewer artifacts, presents cloth details, and renders at 45 FPS.
  • ...and 9 more figures
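
The pipeline described in the Figure 2 caption (nearest-vertex guidance, positional encoding of the Gaussian position, and a distance-preserving prior) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the number of encoding frequencies, the neighbour count `k`, the choice to feed the nearest vertex as raw coordinates, and the exact form of the isometric term are all assumptions made for illustration; only the overall steps come from the caption. Random points stand in for SMPL vertices, and the rotation prior $L_{rot}$ is not sketched.

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """NeRF-style encoding [x, sin(2^k x), cos(2^k x)] for k < num_freqs (assumed form)."""
    freqs = 2.0 ** np.arange(num_freqs)                 # (F,)
    scaled = x[:, None, :] * freqs[:, None]             # (N, F, 3)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)  # (N, F, 6)
    return np.concatenate([x, enc.reshape(x.shape[0], -1)], axis=-1)  # (N, 3 + 6F)

def nearest_vertex(gaussians, vertices):
    """Index of the closest vertex for every Gaussian centre (brute force)."""
    d2 = ((gaussians[:, None, :] - vertices[None, :, :]) ** 2).sum(-1)  # (N, V)
    return d2.argmin(axis=1)

def isometric_loss(canon, posed, k=3):
    """Hypothetical L_iso: penalise changes in distance to the k nearest
    canonical neighbours. Assumes distinct points, so each point's nearest
    entry is itself and can be skipped."""
    d_canon = np.linalg.norm(canon[:, None] - canon[None], axis=-1)  # (N, N)
    nbrs = np.argsort(d_canon, axis=1)[:, 1:k + 1]                   # (N, k)
    d_ref = np.take_along_axis(d_canon, nbrs, axis=1)                # (N, k)
    d_posed = np.linalg.norm(posed[:, None] - posed[nbrs], axis=-1)  # (N, k)
    return np.abs(d_posed - d_ref).mean()

rng = np.random.default_rng(0)
vertices = rng.normal(size=(50, 3))                  # stand-in for SMPL vertices
gaussians = vertices + 0.01 * rng.normal(size=(50, 3))

idx = nearest_vertex(gaussians, vertices)            # deformation guide per Gaussian
# Assumed MLP input: encoded Gaussian position plus guiding vertex coordinates.
mlp_input = np.concatenate([positional_encoding(gaussians), vertices[idx]], axis=-1)

# A rigid motion (90-degree rotation about z plus a translation) keeps all
# neighbour distances, so the isometric term is essentially zero.
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
rigid_loss = isometric_loss(gaussians, gaussians @ R.T + 1.0)
# Uniform scaling changes neighbour distances, so it is penalised.
scaled_loss = isometric_loss(gaussians, 2.0 * gaussians)
```

In this sketch a rigid motion leaves the isometric term near zero while stretching is penalised, which matches the intent stated in the Figure 3 caption that neighbouring Gaussians keep appropriate distances. A practical implementation would replace the brute-force distance matrices with a spatial index, since a full SMPL mesh has 6890 vertices.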