SIG: A Synthetic Identity Generation Pipeline for Generating Evaluation Datasets for Face Recognition

Kassi Nzalasse, Rishav Raj, Eli Laird, Corey Clark

TL;DR

This work addresses the need for ethically sourced, balanced face-recognition evaluation data by introducing the Synthetic Identity Generation (SIG) pipeline, which leverages Stable Diffusion and ControlNets to produce high-quality, pose- and demographics-controlled synthetic identities. SIG produces ControlFace10k, a 10,008-image dataset across 3,336 synthetic identities with balanced race, gender, age, and pose attributes, released openly for research. Analyses compare ControlFace10k with real datasets using state-of-the-art models (ArcFace, GhostFaceNet) to reveal model-specific biases and the utility of synthetic data for bias assessment and evaluation. The authors highlight the privacy, scalability, and controlled-factor benefits of synthetic data and outline plans to scale SIG and use it for bias mitigation in face recognition systems.

Abstract

As Artificial Intelligence applications expand, the evaluation of models faces heightened scrutiny. Ensuring public readiness requires evaluation datasets, which differ from training data by being disjoint and ethically sourced in compliance with privacy regulations. The performance and fairness of face recognition systems depend significantly on the quality and representativeness of these evaluation datasets. Such data is sometimes scraped from the internet without users' consent, raising ethical concerns that can prohibit its use without proper releases. In rare cases, data is collected in a controlled environment with consent; however, this process is time-consuming, expensive, and logistically difficult to execute. This creates a barrier for those unable to muster the immense resources required to gather ethically sourced evaluation datasets. To address these challenges, we introduce the Synthetic Identity Generation pipeline, or SIG, which allows for the targeted creation of ethical, balanced datasets for face recognition evaluation. The pipeline generates high-quality images of synthetic identities with controllable pose, facial features, and demographic attributes such as race, gender, and age. We also release an open-source evaluation dataset named ControlFace10k, consisting of 10,008 face images of 3,336 unique synthetic identities balanced across race, gender, and age, generated using the proposed SIG pipeline. We analyze ControlFace10k alongside the non-synthetic BUPT dataset using state-of-the-art face recognition algorithms to demonstrate its effectiveness as an evaluation tool. This analysis highlights the dataset's characteristics and its utility in assessing algorithmic bias across different demographic groups.
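
The headline counts can be cross-checked with a small sketch. It assumes each identity contributes exactly one image per pose (right, front, left), as the Figure 5 caption describes for ControlFace10k, and that identities are split evenly across the four races and two genders named in the Figure 3 caption. The even split and all variable names are illustrative assumptions; age groups are left out because this summary does not list them.

```python
from itertools import product

# Attribute values taken from the figure captions of this summary.
RACES = ["African", "Asian", "Caucasian", "Indian"]
GENDERS = ["male", "female"]
POSES = ["right", "front", "left"]

# Counts stated in the abstract.
NUM_IDENTITIES = 3336
NUM_IMAGES = 10008

# One image per pose for every identity.
images_per_identity = len(POSES)
total_images = NUM_IDENTITIES * images_per_identity

# Identities per (race, gender) cell under a perfectly even split.
# The even split is an assumption, not a figure reported in the source.
cells = list(product(RACES, GENDERS))
identities_per_cell, remainder = divmod(NUM_IDENTITIES, len(cells))

print("image count matches abstract:", total_images == NUM_IMAGES)
print("identities per (race, gender) cell:", identities_per_cell)
print("identities left over:", remainder)
```

Under these assumptions the identity count divides evenly over the eight race and gender cells, which is consistent with the balance the abstract describes.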

Paper Structure (19 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: (a) Example reference image featuring diverse poses, (b) Pose information extracted by the 'OpenPose' ControlNet
  • Figure 2: Face generated by Stable Diffusion 2.1 base utilizing the trained "pose" ControlNet, demonstrating the model's ability to capture one head orientation
  • Figure 3: Synthetic identities generated with SIG for each race in the ControlFace10k dataset. Each row displays images from left to right: male right-facing, male front-facing, male left-facing, female right-facing, female front-facing, female left-facing, for the races African, Asian, Caucasian, and Indian respectively. The generated identities look realistic, with no irregular textures.
  • Figure 4: Score distributions for non-mated pairs across racial groups, computed with the ArcFace and GhostFaceNet models on the BUPT and ControlFace10k datasets. Panels: (a) African, (b) Asian, (c) Caucasian, (d) Indian. The ideal non-mated similarity score is centered around 0.5, which signifies orthogonal embeddings for different identities. The distributions typically center around 0.5, indicating effective model performance. Notably, the overlap between the synthetic and BUPT distributions shows that the synthetic data's scores follow the behavior of non-synthetic data, which is the desired result. However, differences such as the higher scores ArcFace assigns to African identities compared with GhostFaceNet suggest potential biases in the algorithms. Such observations underscore the utility of synthetic data with race labels in assessing the fairness and accuracy of facial recognition technologies.
  • Figure 5: Facial similarity score distributions for synthetic identities using (A) ArcFace and (B) GhostFaceNet models. Each density curve represents similarity scores among three images of the same synthetic identity. Higher scores indicate greater similarity. The SIG Frontal Sample's sharp peak demonstrates high similarity when generating three frontal images per identity. In contrast, the broader distributions for BUPT (three different poses) and ControlFace10k (one frontal, one left, one right pose) datasets indicate lower similarity scores despite representing the same identity. This suggests face recognition models may be less consistent when comparing varied pose angles of the same individual.
  • ...and 1 more figure
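
The captions of Figures 4 and 5 describe two kinds of score distributions: non-mated pairs (images of different identities) and mated pairs (images of the same identity). The sketch below shows one way such scores could be computed. It assumes the similarity score is cosine similarity rescaled from [-1, 1] to [0, 1] via (1 + cos) / 2, which would make orthogonal embeddings score 0.5; the source does not state the mapping, so this is an assumption. Random vectors stand in for ArcFace or GhostFaceNet embeddings, and the function names are hypothetical.

```python
import numpy as np

def similarity_scores(embeddings):
    """Pairwise scores in [0, 1], assuming score = (1 + cosine) / 2.

    The rescaling is an assumption; the source does not give the formula.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cosine = unit @ unit.T
    return (1.0 + cosine) / 2.0

def split_pairs(scores, identity_ids):
    """Split the upper-triangle scores into mated and non-mated pairs."""
    ids = np.asarray(identity_ids)
    same = ids[:, None] == ids[None, :]
    # Keep each unordered pair once and drop self-comparisons.
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)
    mated = scores[same & upper]
    non_mated = scores[~same & upper]
    return mated, non_mated

# Toy stand-in data: 50 identities with 3 images each, 512-dimensional embeddings.
rng = np.random.default_rng(0)
num_identities, images_per_identity, dim = 50, 3, 512
centers = rng.normal(size=(num_identities, dim))
identity_ids = np.repeat(np.arange(num_identities), images_per_identity)
# Each image embedding is its identity center plus small noise.
embeddings = centers[identity_ids] + 0.3 * rng.normal(size=(len(identity_ids), dim))

mated, non_mated = split_pairs(similarity_scores(embeddings), identity_ids)
print(f"mated pairs: {mated.size}, mean score {mated.mean():.3f}")
print(f"non-mated pairs: {non_mated.size}, mean score {non_mated.mean():.3f}")
```

With real embeddings, grouping the non-mated scores by race label and plotting one density per model and dataset would give a comparison along the lines of Figure 4, while the mated scores correspond to the per-identity comparison in Figure 5.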