Sampling Strategies for Mitigating Bias in Face Synthesis Methods
Emmanouil Maragkoudakis, Symeon Papadopoulos, Iraklis Varlamis, Christos Diou
TL;DR
This work addresses bias in GAN-based face synthesis, focusing on StyleGAN2 trained on FFHQ and on two protected attributes: age and gender. It introduces two post-processing latent-space sampling strategies, Line sampling and Sphere sampling, that rebalance attribute distributions without retraining the generator, and evaluates them with the metrics $IR$, $ID$, and $LLI$ at different levels of GIQA-graded image quality $q$. Experiments show substantial bias in the baseline, which both methods reduce: Line sampling achieves stronger reductions in $IR$ (for example, for age and gender), while Sphere sampling improves diversity and attribute balance at high image quality levels, albeit with slightly higher residual imbalance in some cases. The findings demonstrate practical bias mitigation paths for synthetic face data, with trade-offs between control, diversity, and computational overhead, and point to future work on automation and non-linear sampling strategies.
Abstract
Synthetically generated images can be used to create media content or to complement datasets for training image analysis models. Several methods have recently been proposed for the synthesis of high-fidelity face images; however, the potential biases introduced by such methods have not been sufficiently addressed. This paper examines the bias introduced by the popular StyleGAN2 generative model trained on the Flickr Faces HQ dataset and proposes two sampling strategies to balance the representation of selected attributes in the generated face images. We focus on two protected attributes, gender and age, and reveal that biases arise in the distribution of randomly sampled images against very young and very old age groups, as well as against female faces. These biases are also assessed for different image quality levels based on the GIQA score. To mitigate bias, we propose two alternative methods for sampling on selected lines or spheres of the latent space to increase the number of generated samples from the underrepresented classes. The experimental results show a decrease in bias against underrepresented groups and a more uniform distribution of the protected features at different levels of image quality.
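As a purely geometric illustration of what "sampling on lines or spheres of the latent space" can mean, the following minimal sketch draws latent codes uniformly along a segment between two codes and uniformly on a sphere around a code. It is not the procedure used in the paper: the function names, the 512-dimensional latent size, the random endpoints, and the radius are all assumptions chosen for illustration, and no generator or attribute classifier is involved.

```python
import numpy as np

def sample_on_line(z_a, z_b, n, rng):
    """Draw n latent codes uniformly along the segment from z_a to z_b."""
    t = rng.uniform(0.0, 1.0, size=(n, 1))  # interpolation weights in [0, 1)
    return (1.0 - t) * z_a + t * z_b

def sample_on_sphere(center, radius, n, rng):
    """Draw n latent codes uniformly on the sphere of given radius around center."""
    d = rng.standard_normal(size=(n, center.shape[0]))
    d /= np.linalg.norm(d, axis=1, keepdims=True)  # normalize to unit directions
    return center + radius * d

rng = np.random.default_rng(0)
dim = 512  # assumed latent dimensionality (hypothetical for this sketch)

# Hypothetical anchor codes; in practice these would be latent codes of
# images known to belong to an underrepresented class.
z_a = rng.standard_normal(dim)
z_b = rng.standard_normal(dim)

line_codes = sample_on_line(z_a, z_b, 8, rng)
sphere_codes = sample_on_sphere(z_a, 2.0, 8, rng)
```

Each row of `line_codes` or `sphere_codes` would then be passed to the generator in place of a randomly drawn latent code; how the anchors and the radius are selected is left open here.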
