DSAP: Analyzing Bias Through Demographic Comparison of Datasets

Iris Dominguez-Catena, Daniel Paternain, Mikel Galar

TL;DR

This work proposes DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of two datasets, and demonstrates it on the Facial Expression Recognition task, where demographic bias has previously been found.

Abstract

In the last few years, Artificial Intelligence systems have become increasingly widespread. Unfortunately, these systems can share many biases with human decision-making, including demographic biases. Often, these biases can be traced back to the data used for training, where large uncurated datasets have become the norm. Despite our knowledge of these biases, we still lack general tools to detect and quantify them, as well as to compare the biases in different datasets. Thus, in this work, we propose DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of two datasets. DSAP can be deployed in three key applications: to detect and characterize demographic blind spots and bias issues across datasets, to measure dataset demographic bias in single datasets, and to measure dataset demographic shift in deployment scenarios. An essential feature of DSAP is its ability to robustly analyze datasets without explicit demographic labels, offering simplicity and interpretability for a wide range of situations. To show the usefulness of the proposed methodology, we consider the Facial Expression Recognition task, where demographic bias has previously been found. The three applications are studied over a set of twenty datasets with varying properties. The code is available at https://github.com/irisdominguez/DSAP.

Paper Structure

This paper contains 27 sections, 12 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: DSAP summary. The two steps illustrated at the top of the figure can then be applied for the three applications in the lower part.
  • Figure 2: Demographic axis profiles of each of the datasets, as calculated from the FairFace model predictions. For each of the demographic axes, namely age, gender, and race, the proportion of subjects in each group is shown. In datasets where the identity of the subject is known, the demographic attributes are predicted for each sample and the mode of the sample predictions is used for each subject.
  • Figure 3: Demographic similarity comparison between the datasets in the age axis, and subsequent dataset clustering. The left-hand side dendrogram is obtained from a complete linkage according to the similarity scores. The first column of the matrix shows a potential clustering, obtained with a maximum cophenetic distance of $0.6$.
  • Figure 4: Demographic similarity comparison between the datasets in the gender axis, and subsequent dataset clustering. The left-hand side dendrogram is obtained from a complete linkage according to the similarity scores. The first column of the matrix shows a potential clustering, obtained with a maximum cophenetic distance of $0.6$.
  • Figure 5: Demographic similarity comparison between the datasets in the race axis, and subsequent dataset clustering. The left-hand side dendrogram is obtained from a complete linkage according to the similarity scores. The first column of the matrix shows a potential clustering, obtained with a maximum cophenetic distance of $0.6$.
  • ...and 6 more figures
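
The captions of Figures 2 to 5 describe two operations that can be sketched in a few lines of Python: building a per-axis demographic profile in which each subject counts once (using the mode of its per-sample predictions), and grouping datasets by complete linkage with a maximum cophenetic distance of 0.6. The sketch below is illustrative only and is not the authors' implementation. The function names, the toy data, and the dataset names "A", "B", "C" are hypothetical, and the dissimilarity matrix is assumed to be precomputed as 1 minus the similarity score, since the similarity measure itself is not given in this summary.

```python
from collections import Counter, defaultdict


def demographic_profile(predictions):
    """Proportion of subjects per group from (subject_id, predicted_group) pairs.

    Each subject is counted once, using the mode of its per-sample predictions.
    """
    per_subject = defaultdict(Counter)
    for subject_id, group in predictions:
        per_subject[subject_id][group] += 1
    # most_common(1) yields the mode; ties resolve to the first group seen.
    subject_groups = Counter(
        counts.most_common(1)[0][0] for counts in per_subject.values()
    )
    total = sum(subject_groups.values())
    return {group: n / total for group, n in subject_groups.items()}


def complete_linkage_clusters(names, dist, max_distance):
    """Merge clusters while the complete-linkage distance stays <= max_distance.

    dist[a][b] is a symmetric dissimilarity (assumed here: 1 - similarity).
    Naive implementation, intended for a handful of datasets only.
    """
    clusters = [[name] for name in names]

    def gap(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(dist[a][b] for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:
            break  # the next merge would exceed the cophenetic threshold
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


# Toy predictions on the gender axis: three subjects, six samples.
toy_predictions = [
    ("s1", "Female"), ("s1", "Female"), ("s1", "Male"),
    ("s2", "Male"),
    ("s3", "Male"), ("s3", "Male"),
]
profile = demographic_profile(toy_predictions)

# Toy dissimilarities between three hypothetical datasets.
toy_dist = {
    "A": {"A": 0.0, "B": 0.2, "C": 0.9},
    "B": {"A": 0.2, "B": 0.0, "C": 0.7},
    "C": {"A": 0.9, "B": 0.7, "C": 0.0},
}
groups = complete_linkage_clusters(["A", "B", "C"], toy_dist, max_distance=0.6)
```

For more than a few datasets, SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"`, followed by `fcluster` with `criterion="distance"`, gives the same kind of flat clustering more efficiently.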