LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example

Soyeon Yoon, Kwan Yun, Kwanggyoon Seo, Sihun Cha, Jung Eun Yoo, Junyong Noh

TL;DR

Quantitative and qualitative evaluations demonstrate that the proposed method can produce highly stylized face meshes that follow a given style and output them in a desired topology.

Abstract

Recent advances in 3D face stylization have made significant strides in few- to zero-shot settings. However, the degree of stylization achieved by existing methods is often insufficient for practical applications because they are mostly based on statistical 3D Morphable Models (3DMM) with limited variations. To this end, we propose a method that can produce a highly stylized 3D face model with a desired topology. Our method trains a surface deformation network with 3DMM and translates its domain to the target style using a paired exemplar. The network stylizes the 3D face mesh by mimicking the style of the target using a differentiable renderer and directional CLIP losses. Additionally, during inference, we utilize a Mesh Agnostic Encoder (MAGE) that takes a deformation target, a mesh that may come in diverse topologies, as input to the stylization process and encodes its shape into our latent space. The resulting stylized face model can be animated by commonly used 3DMM blend shapes. A set of quantitative and qualitative evaluations demonstrates that our method can produce highly stylized face meshes according to a given style and output them in a desired topology. We also demonstrate example applications of our method, including image-based stylized avatar generation, linear interpolation of geometric styles, and facial animation of stylized avatars.
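The abstract mentions directional CLIP losses without giving their form in this excerpt. As a rough illustration only, a directional loss is commonly written as one minus the cosine similarity between two embedding-space directions: the change from an input to its stylized output, and the change from the source exemplar to the style exemplar. The sketch below assumes the four embeddings have already been computed (for example, by a CLIP image encoder applied to rendered views); the function names and the use of plain Python lists are hypothetical and are not taken from the paper.

```python
import math


def subtract(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between a and b; eps guards against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def directional_loss(emb_input, emb_output, emb_src_exemplar, emb_style_exemplar):
    """1 - cos(edit direction, exemplar direction).

    All arguments are assumed to be precomputed embedding vectors
    (hypothetical inputs; the paper's exact pairing may differ).
    """
    edit_direction = subtract(emb_output, emb_input)
    exemplar_direction = subtract(emb_style_exemplar, emb_src_exemplar)
    return 1.0 - cosine_similarity(edit_direction, exemplar_direction)


if __name__ == "__main__":
    # Toy 2D "embeddings": the edit moves in the same direction as the exemplar pair.
    loss = directional_loss([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [3.0, 1.0])
    print(loss)
```

The loss is small when the output moves away from its input in the same embedding direction that separates the style exemplar from its source, regardless of how far it moves, which is why a direction rather than a distance is compared.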

Paper Structure (31 sections, 10 equations, 10 figures, 3 tables)

Figures (10)

  • Figure 1: (a) The proposed method demonstrates robustness to unseen face identities and topologies and effectively generates stylized output faces with desired topologies. (b) Our stylized avatars can be animated using 3DMM blend shapes.
  • Figure 2: Comparison of different stylized 3D face generation methods and their limitations in meeting three key requirements. 3D-aware methods cannot generate 3D faces in desired topologies. 3DMM-based methods have limited stylization capability. Text-based deformation models are not directly animatable. The proposed method meets all three requirements.
  • Figure 3: Overview of our method: The upper box illustrates the inference stage, where our method takes diverse deformation targets and generates stylized outputs. The lower-left box depicts the training process of the Mesh Agnostic Encoder (MAGE). The lower-right box illustrates the fine-tuning process of $D_T$.
  • Figure 4: Demonstration of stylized 3D faces with desired topology, regardless of deformation target variations.
  • Figure 5: Stylization results across diverse styles and identities. Our approach generates varied styles while preserving the deformation target identity and generalizing to diverse geometric representations like masks and point clouds.
  • ...and 5 more figures
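
Figure 1(b) notes that the stylized avatars can be animated with 3DMM blend shapes. One common way to do this, when the stylized mesh shares the 3DMM's vertex ordering, is the linear delta blend shape model: each animated vertex is the base vertex plus a weighted sum of per-vertex offsets. The sketch below is a minimal illustration under that assumption; the function name and toy data are hypothetical, and the paper's exact animation procedure is not described in this excerpt.

```python
def apply_blendshapes(base_vertices, blendshape_deltas, weights):
    """Return base + sum_k weights[k] * blendshape_deltas[k], per vertex.

    base_vertices:     list of [x, y, z] positions of the stylized mesh
    blendshape_deltas: list of blend shapes, each a list of [dx, dy, dz]
                       offsets with one entry per vertex (same vertex order)
    weights:           one scalar activation per blend shape
    """
    if len(blendshape_deltas) != len(weights):
        raise ValueError("need exactly one weight per blend shape")

    animated = [list(v) for v in base_vertices]  # copy so the base mesh is untouched
    for delta, weight in zip(blendshape_deltas, weights):
        if len(delta) != len(base_vertices):
            raise ValueError("blend shape and mesh must share the vertex count")
        for i, offset in enumerate(delta):
            for axis in range(3):
                animated[i][axis] += weight * offset[axis]
    return animated


if __name__ == "__main__":
    base = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    deltas = [
        [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],  # hypothetical "jaw open" offsets
        [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]],  # hypothetical "smile" offsets
    ]
    print(apply_blendshapes(base, deltas, [0.5, 1.0]))
```

The vertex-count check reflects why output in a desired topology matters for animation: per-vertex offsets only transfer directly when the stylized mesh and the blend shapes index the same vertices.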