Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters

Zechen Bai, Peng Chen, Xiaolan Peng, Lu Liu, Hui Chen, Mike Zheng Shou, Feng Tian

TL;DR

A deep learning model is trained to retarget facial expressions from input face images to virtual human faces by estimating blendshape coefficients, and a practical toolkit built with Unity 3D makes the approach compatible with the most popular VR applications.

Abstract

Animating virtual characters has always been a fundamental research problem in virtual reality (VR). Facial animations play a crucial role, as they effectively convey the emotions and attitudes of virtual humans. However, creating such facial animations can be challenging, as current methods often rely on expensive motion capture devices or require significant time and effort from human animators to tune animation parameters. In this paper, we propose a holistic solution to automatically animate virtual human faces. First, a deep learning model was trained to retarget the facial expression from input face images to virtual human faces by estimating the blendshape coefficients. This method offers the flexibility of generating animations for characters with different appearances and blendshape topologies. Second, a practical toolkit was developed using Unity 3D, making it compatible with the most popular VR applications. The toolkit accepts both images and videos as input to animate the target virtual human faces and enables users to manipulate the animation results. Furthermore, inspired by the spirit of Human-in-the-loop (HITL), we leveraged user feedback to further improve the performance of the model and toolkit, making the results easier to customize to user preferences. The whole solution, for which we will make the code public, has the potential to accelerate the generation of facial animations for use in VR applications.
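
To make the role of blendshape coefficients concrete, the following is a minimal sketch of the standard linear blendshape model, in which a face mesh is the neutral mesh plus a weighted sum of per-blendshape vertex offsets. It is not the authors' code: the function name `apply_blendshapes`, the list-based mesh layout, and the clamping of coefficients to [0, 1] are all assumptions made for illustration.

```python
from typing import List, Sequence

Vertex = List[float]  # one [x, y, z] position or offset


def apply_blendshapes(
    neutral: Sequence[Vertex],
    deltas: Sequence[Sequence[Vertex]],
    coefficients: Sequence[float],
) -> List[Vertex]:
    """Return neutral + sum_i(w_i * delta_i) for every vertex.

    neutral:      vertex positions of the expressionless face
    deltas:       one list of per-vertex offsets for each blendshape
    coefficients: one weight per blendshape (hypothetical range: 0..1)
    """
    if len(deltas) != len(coefficients):
        raise ValueError("need exactly one coefficient per blendshape")

    result = [list(v) for v in neutral]  # copy so the input is not mutated
    for delta, weight in zip(deltas, coefficients):
        weight = min(1.0, max(0.0, weight))  # assumption: clamp to [0, 1]
        for vertex_index, offset in enumerate(delta):
            for axis in range(3):
                result[vertex_index][axis] += weight * offset[axis]
    return result


# Toy mesh with two vertices and two blendshapes.
neutral_mesh = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
blendshape_deltas = [
    [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],  # moves vertex 0 along y
    [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]],  # moves vertex 1 along z
]
animated = apply_blendshapes(neutral_mesh, blendshape_deltas, [0.5, 1.0])
```

Under this view, a character with a different number of blendshapes simply has a coefficient vector of a different length, which is why a model that outputs coefficients has to be adapted to each blendshape topology.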

Paper Structure

This paper contains 40 sections, 10 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Given a target facial video as reference, bring your own character into our solution integrated with Unity3D, and it automatically generates facial animation for the virtual character.
  • Figure 2: Illustration of the training and usage of the model. The first stage trains the base model, which regresses 3DMM facial parameters. The second stage trains the adapter model to estimate the target blendshape coefficients with the help of the pre-trained base model. The usage stage uses the pre-trained base model and adapter model to drive the virtual human to replicate the facial expressions and head pose of the reference image.
  • Figure 3: The left part shows the user interface of the proposed toolkit. The right part shows the two modes of Human-in-the-loop (HITL). In the online mode, users can adjust the blendshape coefficients of several key-frames and then apply the preference to the whole video. In the offline mode, the adjusted blendshape coefficients are collected as ground-truth data to finetune the adapter model offline, further boosting the performance of the model.
  • Figure 4: Upper part: facial animation examples of one specific virtual character. Lower part: facial animation examples of virtual characters with different texture appearances but the same blendshape topology. The results are generated fully automatically by the deep learning model, without head pose and without human-in-the-loop intervention.
  • Figure 5: Facial animation examples of virtual characters with the same texture appearance but different blendshape topologies. The models are equipped with 25, 66, and 113 blendshapes, respectively. The results are generated fully automatically by the deep learning model, without head pose and without human-in-the-loop intervention.
  • ...and 1 more figures
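
The online HITL mode in the Figure 3 caption (adjust a few key-frames, then apply the preference to the whole video) can be illustrated with a small sketch. The caption does not say how the toolkit propagates the edits, so the scheme below is purely an assumption: the correction at each key-frame (adjusted minus predicted coefficients) is linearly interpolated between key-frames, held constant before the first and after the last, and the result is clamped to [0, 1]. The function name `propagate_keyframe_edits` is hypothetical.

```python
from typing import Dict, List, Sequence


def propagate_keyframe_edits(
    predicted: Sequence[Sequence[float]],
    edits: Dict[int, Sequence[float]],
) -> List[List[float]]:
    """Spread user corrections made on key-frames over all frames.

    predicted: model output, one coefficient list per video frame
    edits:     frame index -> user-adjusted coefficients for that frame
    Assumption: corrections are interpolated linearly between key-frames.
    """
    if not edits:
        return [list(frame) for frame in predicted]

    keys = sorted(edits)
    # Per key-frame correction: what the user changed relative to the model.
    offsets = {
        k: [a - p for a, p in zip(edits[k], predicted[k])] for k in keys
    }

    corrected = []
    for t, frame in enumerate(predicted):
        if t <= keys[0]:
            offset = offsets[keys[0]]
        elif t >= keys[-1]:
            offset = offsets[keys[-1]]
        else:
            left = max(k for k in keys if k <= t)
            right = min(k for k in keys if k >= t)
            if left == right:
                offset = offsets[left]
            else:
                alpha = (t - left) / (right - left)
                offset = [
                    (1.0 - alpha) * lo + alpha * hi
                    for lo, hi in zip(offsets[left], offsets[right])
                ]
        corrected.append(
            [min(1.0, max(0.0, c + o)) for c, o in zip(frame, offset)]
        )
    return corrected


# Five frames, one blendshape; the user raises the last frame from 0.2 to 0.6.
model_output = [[0.2], [0.2], [0.2], [0.2], [0.2]]
user_edits = {0: [0.2], 4: [0.6]}
smoothed = propagate_keyframe_edits(model_output, user_edits)
```

In this toy run the correction grows gradually from the first key-frame to the last, so the middle frame lands roughly halfway between the two user-specified values. Interpolating offsets rather than absolute values preserves the frame-to-frame motion predicted by the model.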