Subjective Face Transform using Human First Impressions

Chaitanya Roygaga, Joshua Krinsky, Kai Zhang, Kenny Kwok, Aparna Bharati

TL;DR

The paper tackles how to model and manipulate subjective first impressions of faces without altering identity. It introduces a continuous normalizing flow–based framework operating in StyleGAN2 latent space to map latent codes to subjective attribute scores and perform identity-preserving edits along target impression axes, using GAN inversion and an identity regularizer. The method is trained on real and synthetic data and evaluated for in-domain and out-of-domain generalization, showing perceptually meaningful edits with preserved identity, and it demonstrates that synthetic data augmentation can improve first-impression prediction models across multiple datasets. The work also addresses biases and ethics in subjectivity of face perception, providing tools to study and mitigate biases while enabling applications in debiasing experiments and trait-focused data augmentation.

Abstract

Humans tend to form quick subjective first impressions of non-physical attributes, such as perceived trustworthiness or attractiveness, when seeing someone's face. To understand what variations in a face lead to different subjective impressions, this work uses generative models to find semantically meaningful edits to a face image that change perceived attributes. Unlike prior work that relied on statistical manipulation in feature space, our end-to-end framework considers trade-offs between preserving identity and changing perceptual attributes. It maps latent space directions to changes in attribute scores, enabling a perceptually significant, identity-preserving transformation of any input face along an attribute axis according to a target change. We train on real and synthetic faces and evaluate on in-domain and out-of-domain images using predictive models and human ratings, demonstrating the generalizability of our approach. Ultimately, such a framework can be used to understand and explain trends and biases in the subjective interpretation of faces that do not depend on the subject's identity. We demonstrate this through improved first-impression prediction performance when the training data is augmented with images generated by the proposed approach, which gives the model a wider range of inputs from which to learn associations between face features and subjective attributes.
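
The idea of mapping latent codes to attribute scores and then editing a latent toward a target score can be illustrated with a toy example. The sketch below is not the paper's model: it replaces the continuous normalizing flow with a simple invertible affine map conditioned on the score, and the parameters `A` and `B`, the latent size `DIM`, and the function names are all hypothetical. It only shows the mechanism of conditional invertible editing: encode the latent under its current score, keep the base code fixed, and decode under the target score.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy latent size (assumption); a real StyleGAN2 latent is much larger

# Hypothetical stand-ins for learned parameters of a score-conditioned affine map.
A = rng.normal(size=DIM)             # shift of the latent per unit of score
B = rng.normal(scale=0.1, size=DIM)  # log-scale of the latent per unit of score


def forward(w, s):
    """Map latent w to a base code z, conditioned on score s in [0, 1]."""
    return (w - A * s) * np.exp(-B * s)


def inverse(z, s):
    """Map base code z back to a latent, conditioned on score s."""
    return z * np.exp(B * s) + A * s


def edit(w, s_src, delta):
    """Change the score by delta (clipped to [0, 1]) while keeping z fixed."""
    s_tgt = float(np.clip(s_src + delta, 0.0, 1.0))
    z = forward(w, s_src)          # strip the source-score conditioning
    return inverse(z, s_tgt), s_tgt  # re-apply it at the target score


w = rng.normal(size=DIM)                           # stand-in for an inverted face latent
w_edit, s_tgt = edit(w, s_src=0.3, delta=0.4)      # raise the score
w_back, _ = edit(w_edit, s_src=s_tgt, delta=-0.4)  # undo the edit
```

Because the map is invertible for every score, reversing the score change recovers the original latent up to floating-point error, which is the property that makes such edits controllable. In a full pipeline the edited latent would be passed to the generator to produce the transformed image.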

Paper Structure (37 sections, 8 equations, 15 figures, 10 tables)

Figures (15)

  • Figure 1: We propose a method of exploring possible identity-disentangled semantic variations of a face image in a latent space that encodes subjective attributes. Upon learning the mapping between the latent representations of images and the corresponding subjective attribute scores, any image can be transformed based on a desired score change for a particular attribute, with all possible score-based variations scaled to $[0,1]$. Original (Orig.) / Inversion (Inv.) shows the original image as reconstructed using inversion of the latent representation learned by the GAN [karras2020training].
  • Figure 2: Proposed face transformation method. The original face image is given as input. The first phase extracts image features (image latents) and the corresponding human-like score predictions for the selected attribute. Then, to generate a disentangled and continuous mapping between image latent features and the corresponding scores, the image latents are continuously evolved according to the predicted attribute scores. The editing phase uses this mapping to directly transform the input image latent $w$ according to the desired score change. Finally, the inversion step reconstructs the transformed image from the edited latent $w'$.
  • Figure 3: Predicted score distribution for evaluation image sets.
  • Figure 4: Qualitative results for training without ID loss (left) and with ID loss (right). The diversity of image features in the training data affects the transformations. Comparing the two sides, we observe apparent differences in edits, such as a decrease in age when reducing perceived dominance or increasing perceived trustworthiness, as well as subtler changes such as differences in hairstyle or the subject's eye color.
  • Figure 5: Unintended reflection of training data biases. Chubby face ($\downarrow$ Attractiveness) and darker skin tone ($\uparrow$ Dominance) reflect human annotation biases. Biases are further discussed in Section \ref{section:limitations_biases}.
  • ...and 10 more figures
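
The ID loss referred to in the Figure 4 caption is not specified in this summary. As a rough illustration only, the sketch below assumes a common form: one minus the cosine similarity between face-recognition embeddings of the original and edited images, added to an attribute term with a weight. The function names, the `id_weight` parameter, and the toy embeddings are hypothetical, and a real system would obtain the embeddings from a pretrained face-recognition network.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def identity_loss(emb_orig, emb_edit):
    """Assumed ID loss: 1 - cosine similarity (0 when directions match)."""
    return 1.0 - cosine_similarity(emb_orig, emb_edit)


def total_loss(attr_loss, emb_orig, emb_edit, id_weight=1.0):
    """Hypothetical combined objective: attribute term plus weighted ID term."""
    return attr_loss + id_weight * identity_loss(emb_orig, emb_edit)


# Toy embeddings standing in for face-recognition features.
emb = np.array([1.0, 0.0, 0.0])
emb_same = np.array([2.0, 0.0, 0.0])  # same direction: identity preserved
emb_diff = np.array([0.0, 1.0, 0.0])  # orthogonal: identity changed

loss_same = identity_loss(emb, emb_same)
loss_diff = identity_loss(emb, emb_diff)
combined = total_loss(0.5, emb, emb_diff, id_weight=0.1)
```

Under this assumed form, a larger `id_weight` penalizes identity drift more strongly at the cost of smaller attribute changes, which is the kind of trade-off the abstract describes.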