Disentangling Racial Phenotypes: Fine-Grained Control of Race-related Facial Phenotype Characteristics

Seyma Yucer; Amir Atapour Abarghouei; Noura Al Moubayed; Toby P. Breckon

TL;DR

The paper tackles bias in automated facial analysis by enabling fine-grained control of race-related phenotypes within 2D face images. It introduces a 2D-only framework that factorises the latent space into $k$ phenotype components using the $I_C$ and $I_F$ datasets, encoders $E_F$ and $E_C$, a mapping network $E_{map}$, and a StyleGAN2 generator $G$, coupled with a two-stage training regime and one-shot fine-tuning. The approach achieves explicit control over skin colour, hair colour and facial feature shapes, improving photorealism (lower FID) and controllability relative to ConfigNet while avoiding 3D data requirements. However, some identity-relevant shape attributes remain entangled, and ethical considerations regarding data distribution and potential misuse are discussed, highlighting both the practical utility and the need for careful deployment in bias analysis contexts.

Abstract

Achieving effective fine-grained appearance variation over 2D facial images, whilst preserving facial identity, is a challenging task due to the high complexity and entanglement of common 2D facial feature encoding spaces. Despite these challenges, such fine-grained control, by way of disentanglement, is a crucial enabler for data-driven racial bias mitigation strategies across multiple automated facial analysis tasks, as it allows human facial diversity to be analysed, characterised and synthesised. In this paper, we propose a novel GAN framework to enable fine-grained control over individual race-related phenotype attributes of facial images. Our framework factors the latent (feature) space into elements that correspond to race-related facial phenotype representations, thereby separating phenotype aspects (e.g. skin colour, hair colour, and nose, eye and mouth shapes), which are notoriously difficult to annotate robustly in real-world facial data. Concurrently, we also introduce a high-quality, augmented, diverse 2D face image dataset drawn from CelebA-HQ for GAN training. Unlike prior work, our framework relies only upon 2D imagery and related parameters to achieve state-of-the-art individual control over race-related phenotype attributes with improved photo-realistic output.

Paper Structure (12 sections, 4 equations, 9 figures, 2 tables)

Introduction
Related Work
Methodology
Race-related Facial Phenotypes in Factorised Latent Space
Proposed Framework
Experimental Results
Datasets
Image Quality - Photorealism
Controllability
Discussion
Ethical Considerations
Conclusion

Figures (9)

Figure 1: Generated images with controlled race-related phenotypes by our proposed framework.
Figure 2: Metric-based parameters for race-related facial phenotypes: (a) top column images sourced from CelebA-HQ [karras2017progressive]; (b) mask images provided by MaskGAN [CelebAMask-HQ]; (c) the facial skin area used for skin colour; (d) the hair area used for hair colour; (e-h) the specific face patch inputs applied for feature extraction.
Figure 3: The proposed framework employs two encoders, $E_F$ and $E_C$, that encode face images $I_F$ and $I_C$ into latent-space vectors $z_F$ and $z_C$, respectively. These vectors are further mapped into $w_F$ and $w_C$ by $E_{map}$ and then fed into the shared decoder $G$ for image generation. A domain discriminator $D_{DA}$ ensures the similarity of the latent distributions generated by $E_F$ and $E_C$.
Figure 4: The impact of one-shot learning through fine-tuning. (a) Original image. (b) Reconstructed image after second-stage training. (c) Reconstructed image after fine-tuning.
Figure 5: Inference of our proposed framework.
...and 4 more figures
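
The data flow in the Figure 3 caption, together with the factorised latent space mentioned in the TL;DR, can be illustrated with a toy sketch. This is not the authors' implementation: the networks are replaced by random linear maps, the values of `K`, `CHUNK` and `IMG_DIM` are hypothetical, the domain discriminator $D_{DA}$ and all training are omitted, and the idea of editing a phenotype by swapping one contiguous latent chunk (`swap_component`) is an assumption made purely for illustration.

```python
import math
import random

random.seed(0)

K = 4          # number of phenotype components (hypothetical value)
CHUNK = 8      # length of each latent component (hypothetical value)
IMG_DIM = 64   # flattened toy "image" length (hypothetical value)
Z_DIM = K * CHUNK


def random_matrix(rows, cols):
    """Random weights standing in for a trained network."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on Python lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Toy stand-ins for the modules named in the Figure 3 caption.
W_EF = random_matrix(Z_DIM, IMG_DIM)   # E_F: I_F -> z_F
W_EC = random_matrix(Z_DIM, IMG_DIM)   # E_C: I_C -> z_C
W_MAP = random_matrix(Z_DIM, Z_DIM)    # E_map: z -> w
W_G = random_matrix(IMG_DIM, Z_DIM)    # G: w -> image


def encode_f(image):
    return matvec(W_EF, image)


def encode_c(image):
    return matvec(W_EC, image)


def generate(z):
    """Map z to w with the toy E_map, then decode with the toy G."""
    w = [math.tanh(x) for x in matvec(W_MAP, z)]
    return matvec(W_G, w)


def swap_component(z_target, z_source, index):
    """Return z_target with latent chunk `index` taken from z_source.

    Assumption: each phenotype occupies one contiguous chunk of z.
    """
    start, stop = index * CHUNK, (index + 1) * CHUNK
    return z_target[:start] + z_source[start:stop] + z_target[stop:]


image_f = [random.random() for _ in range(IMG_DIM)]
image_c = [random.random() for _ in range(IMG_DIM)]

z_f = encode_f(image_f)
z_c = encode_c(image_c)

# Replace only the second component of z_F with the one from z_C.
z_edit = swap_component(z_f, z_c, index=1)
output = generate(z_edit)
```

In this toy setting, `z_edit` differs from `z_f` only in the chosen chunk, which is the property a factorised latent space is meant to provide; whether the actual framework edits attributes in this way is not stated in the text above.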
