Sampling Strategies for Mitigating Bias in Face Synthesis Methods

Emmanouil Maragkoudakis, Symeon Papadopoulos, Iraklis Varlamis, Christos Diou

TL;DR

This work addresses bias in GAN-based face synthesis, focusing on StyleGAN2 trained on FFHQ and on two protected attributes: age and gender. It introduces two post-processing latent-space sampling strategies, Line sampling and Sphere sampling, that rebalance attribute distributions without retraining the generator, and evaluates them with the metrics $IR$, $ID$, and $LLI$ across levels of GIQA-graded image quality $q$. Experiments show substantial bias in the baseline, which both methods reduce: Line sampling achieves stronger reductions in $IR$ (e.g., for age and gender), while Sphere sampling improves diversity and attribute balance at high quality levels, albeit with slightly higher residual imbalance in some cases. The findings demonstrate practical bias mitigation paths for synthetic face data, with clear trade-offs between control, diversity, and computational overhead, and point to future work on automation and non-linear sampling strategies.

Abstract

Synthetically generated images can be used to create media content or to complement datasets for training image analysis models. Several methods have recently been proposed for the synthesis of high-fidelity face images; however, the potential biases introduced by such methods have not been sufficiently addressed. This paper examines the bias introduced by the widely popular StyleGAN2 generative model trained on the Flickr-Faces-HQ (FFHQ) dataset and proposes two sampling strategies to balance the representation of selected attributes in the generated face images. We focus on two protected attributes, gender and age, and reveal that biases arise in the distribution of randomly sampled images against very young and very old age groups, as well as against female faces. These biases are also assessed for different image quality levels based on the GIQA score. To mitigate bias, we propose two alternative methods for sampling on selected lines or spheres of the latent space to increase the number of generated samples from the underrepresented classes. The experimental results show a decrease in bias against underrepresented groups and a more uniform distribution of the protected features at different levels of image quality.

Paper Structure (11 sections, 7 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Face images generated by a pre-trained StyleGAN2 model. The left figure (a) shows the highest-quality images in a dataset of 5000 images, as measured by the GIQA score. The right figure (b) shows the lowest-quality images of the same dataset. There is a pronounced gender and age bias in the two image quality groups: the majority of the top 50 images are faces of white adult men, while the majority of the bottom 50 are images of children of color.
  • Figure 2: Moving along a straight line of the latent space. The method has selected two images with the same age and gender attributes to perform the linear sampling. In this scenario, we are looking for more images of middle-aged men to balance out a dataset that contains more images of women in the corresponding age group.
  • Figure 3: Moving in a sphere around a selected point in the latent space.
  • Figure 4: Plot representation of the distribution of features in the initial dataset.
  • Figure 5: Two examples of the line sampling strategy.
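
The geometric idea behind Figures 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the exact procedure, the latent space used, and the radius are not given in this section. The sketch assumes a 512-dimensional latent vector, evenly spaced steps along the segment between two latent codes for the line case, and points at a fixed distance from a chosen code for the sphere case. The helper names `line_samples` and `sphere_samples` are hypothetical, and no generator or attribute classifier is called.

```python
import math
import random

LATENT_DIM = 512  # assumption: the usual StyleGAN2 latent size


def line_samples(z_a, z_b, n):
    """Return n points evenly spaced on the segment from z_a to z_b."""
    if n < 2:
        raise ValueError("n must be at least 2")
    steps = [i / (n - 1) for i in range(n)]
    return [[(1 - t) * a + t * b for a, b in zip(z_a, z_b)] for t in steps]


def sphere_samples(center, radius, n, rng):
    """Return n points drawn uniformly from the sphere surface around center."""
    points = []
    for _ in range(n):
        # A normalised Gaussian vector is uniformly distributed in direction.
        direction = [rng.gauss(0.0, 1.0) for _ in center]
        norm = math.sqrt(sum(d * d for d in direction))
        points.append([c + radius * d / norm for c, d in zip(center, direction)])
    return points


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two latent codes standing in for images that share the target attributes.
z_a = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

on_line = line_samples(z_a, z_b, 5)           # includes both endpoints
on_sphere = sphere_samples(z_a, 0.5, 4, rng)  # radius 0.5 is an arbitrary choice
```

In a full pipeline, each sampled code would be passed to the generator and the resulting image checked for the target attributes; those steps depend on details outside this section and are omitted here.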