SemUV: Deep Learning based semantic manipulation over UV texture map of virtual human heads

Anirban Mukherjee, Venkat Suprabath Bitra, Vignesh Bondugula, Tarun Reddy Tallapureddy, Dinesh Babu Jayagopi

TL;DR

SemUV is a simple and effective approach, built on the FFHQ-UV dataset, for semantic manipulation directly within the UV texture space. It integrates seamlessly into standard 3D graphics pipelines without demanding extensive domain expertise, time, or resources.

Abstract

Designing and manipulating virtual human heads is essential across various applications, including AR, VR, gaming, human-computer interaction, and VFX. Traditional graphics-based approaches require manual effort and resources to achieve an accurate representation of human heads. While modern deep learning techniques can generate and edit highly photorealistic images of faces, their focus remains predominantly on 2D facial images, which makes them less suitable for 3D applications. Editing within the UV texture space is a key component of the 3D graphics pipeline, so our work focuses on this aspect to give graphic designers enhanced control and precision in appearance manipulation. Existing research on methods that operate within the UV texture space is limited, and the available methods are complex and challenging to use. In this paper, we introduce SemUV: a simple and effective approach using the FFHQ-UV dataset for semantic manipulation directly within the UV texture space. We train a StyleGAN model on the publicly available FFHQ-UV dataset, and subsequently train a boundary for interpolation and semantic feature manipulation. Through experiments comparing our method with a 2D manipulation technique, we demonstrate its superior ability to preserve identity while effectively modifying semantic features such as age, gender, and facial hair. Our approach is simple, agnostic to other 3D components such as structure, lighting, and rendering, and integrates seamlessly into standard 3D graphics pipelines without demanding extensive domain expertise, time, or resources.
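
The boundary training and interpolation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes latent codes are plain NumPy vectors, uses a small logistic-regression fit as a stand-in for whatever linear classifier the paper trains, and runs on synthetic data. The names `fit_boundary` and `edit_latent` are hypothetical, and decoding the edited code with a trained StyleGAN generator is not shown.

```python
import numpy as np


def fit_boundary(latents, labels, lr=0.1, steps=500):
    """Fit a linear decision boundary with logistic regression (an assumed
    stand-in classifier) and return its unit normal and rescaled bias."""
    n, d = latents.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = latents @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels              # gradient of the log-loss w.r.t. logits
        w -= lr * (latents.T @ grad) / n
        b -= lr * grad.mean()
    norm = np.linalg.norm(w)
    return w / norm, b / norm


def edit_latent(latent, normal, alpha):
    """Move a latent code by a signed distance alpha along the boundary normal."""
    return latent + alpha * normal


# Synthetic stand-in for projected UV latent codes: the binary attribute
# depends only on the first latent dimension.
rng = np.random.default_rng(0)
latents = rng.normal(size=(400, 16))
labels = (latents[:, 0] > 0).astype(float)

normal, bias = fit_boundary(latents, labels)

code = latents[0]
edited = edit_latent(code, normal, alpha=3.0)
# In a full pipeline, `edited` would be passed to the trained generator to
# produce the modified UV map (hypothetical step, not shown here).
```

Because the normal has unit length, `alpha` is the signed distance moved across the boundary, so varying it in small steps gives a gradual change in the attribute.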

Paper Structure (15 sections, 7 figures, 1 table)

This paper contains 15 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: SemUV vs. image-based approach: Our approach operates only in the UV space, focusing on the face texture, which prevents the unwanted changes that can occur in the image domain. Moreover, the pathway for performing changes directly in UV space is shorter, making the approach simpler and faster.
  • Figure 2: SemUV overview: Our approach has four steps. (1) We first learn the distribution of the UV space using a generative model, in this case StyleGAN. (2) Using available labels, we train a linear classifier to learn the decision boundary for the semantic features. (3) Given a new UV image, we project it into the UV latent space and interpolate it across the learned boundary to perform disentangled semantic manipulation. (4) Finally, we pass the new latent vector to our trained GAN to generate the semantically modified UV map, which is wrapped onto the head mesh and rendered for the final output.
  • Figure 3: Outputs of SemUV: Images from left to right show increasing age in (top) the UV texture map and (bottom) the final head mesh wrapped with the texture.
  • Figure 4: FID and KID scores over 3000 epochs: Low FID and KID scores indicate high quality of the generated samples.
  • Figure 5: Semantic manipulation results from (top) front, (middle) right and (bottom) left viewpoint. The changes in the features are made from left to right.
  • ...and 2 more figures
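
For readers unfamiliar with the FID score plotted in Figure 4, the following is a minimal sketch of the Fréchet distance between two feature sets, each modeled as a Gaussian. It is an illustration under stated assumptions, not the evaluation code used in the paper: in practice the features come from a pretrained Inception network, whereas here they are random arrays, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays:
    ||mu_a - mu_b||^2 + Tr(C_a) + Tr(C_b) - 2 Tr((C_a C_b)^(1/2))."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # The trace of the matrix square root equals the sum of the square roots
    # of the eigenvalues of C_a C_b, which are real and non-negative for
    # positive semi-definite covariances (clipping guards rounding error).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Random stand-ins for real and generated image features (assumption).
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))
shifted_feats = real_feats + 2.0   # same covariance, mean shifted by 2 per dimension

same_score = frechet_distance(real_feats, real_feats)
shifted_score = frechet_distance(real_feats, shifted_feats)
```

Identical feature sets give a distance of essentially zero, and the distance grows as the generated distribution drifts from the real one, which is why lower values in Figure 4 indicate better samples.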