
Toward Fairer Face Recognition Datasets

Alexandre Fournier-Montgieux, Michael Soumm, Adrian Popescu, Bertrand Luvison, Hervé Le Borgne

TL;DR

This work proposes a comprehensive evaluation that weighs accuracy and fairness equally, includes a rigorous regression-based statistical analysis of attributes, and shows that balancing reduces demographic unfairness.

Abstract

Face recognition and verification are two computer vision tasks whose performance has progressed with the introduction of deep representations. However, ethical, legal, and technical challenges due to the sensitive character of face data and biases in real training datasets hinder their development. Generative AI addresses privacy by creating fictitious identities, but fairness problems persist. We promote fairness by introducing a demographic attribute balancing mechanism in generated training datasets. We experiment with an existing real dataset, three generated training datasets, and the balanced versions of a diffusion-based dataset. We propose a comprehensive evaluation that considers accuracy and fairness equally and includes a rigorous regression-based statistical analysis of attributes. The analysis shows that balancing reduces demographic unfairness. A performance gap nevertheless persists, even though generation has become more accurate over time. The proposed balancing method and comprehensive verification evaluation promote fairer and more transparent face recognition and verification.
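The demographic attribute balancing described in the abstract can be sketched as follows. This is a minimal illustration only, assuming balancing amounts to equalizing identity counts across combinations of demographic attributes (e.g. ethnicity × gender); `balance_by_attributes` is a hypothetical helper, not the authors' implementation.

```python
import random
from collections import defaultdict

def balance_by_attributes(samples, attr_keys, seed=0):
    """Downsample so every demographic combination (e.g. ethnicity x gender)
    keeps the same number of identities as the rarest combination."""
    groups = defaultdict(list)
    for s in samples:
        # Group identities by their tuple of demographic attribute values.
        groups[tuple(s[k] for k in attr_keys)].append(s)
    n_min = min(len(g) for g in groups.values())
    rng = random.Random(seed)
    balanced = []
    for g in groups.values():
        # Keep a random subset of size n_min from each group.
        balanced.extend(rng.sample(g, n_min))
    return balanced
```

Downsampling to the rarest combination is only one possible strategy; a generative pipeline such as the one studied here can instead synthesize additional identities for under-represented groups.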

Paper Structure

This paper contains 21 sections, 4 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Fairness of the three training datasets in terms of equal opportunity hardt2016equality_opportunity when considering the advantaged outcome as being person differentiation: for each combination $(i,j)$ of an attribute, we consider the difference in opportunity between the demographics $i$ and $j$. Values close to 0 indicate an equality of outcome. Considered demographics: White, Black, Asian, Indian; Young, Adult, Senior; Male, Female
  • Figure 1: UMAPs mcinnes2020umap applied on $DCFace$, $DCFace_{all}$ and CASIA
  • Figure 2: Proposed attribute control method applied to DCFace kim2023dcface incorporating attribute balancing. Ethnicity and gender are controlled when generating the ID image. Age and pose are controlled when mixing images to obtain the visual representation of each identity. The figure is a modification of the original one from kim2023dcface.
  • Figure 2: Distance distribution to the cluster mean for FAVCI2D images. One image per identity is considered within each cluster
  • Figure 3: Accuracy imbalance for FAVCI2D
  • ...and 5 more figures
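The pairwise equal-opportunity differences plotted in Figure 1 can be sketched as follows. This is a minimal illustration, assuming opportunity is measured as a per-group true-positive rate; `equal_opportunity_gaps` is a hypothetical helper, not the authors' code.

```python
from itertools import combinations

def equal_opportunity_gaps(tpr_by_group):
    """Pairwise differences in true-positive rate between demographic groups
    (i, j); values close to 0 indicate equality of outcome."""
    return {(i, j): tpr_by_group[i] - tpr_by_group[j]
            for i, j in combinations(sorted(tpr_by_group), 2)}
```

For example, TPRs of {'Asian': 0.90, 'Black': 0.85, 'White': 0.92} yield a gap of 0.05 between Asian and Black and -0.07 between Black and White, the kind of imbalance the proposed balancing is meant to shrink toward 0.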