
Less can be more: representational vs. stereotypical gender bias in facial expression recognition

Iris Dominguez-Catena, Daniel Paternain, Aranzazu Jurio, Mikel Galar

TL;DR

Representational bias propagates into models more weakly than expected, while stereotypical bias has a stronger impact; a bias analysis that differentiates between types of bias is crucial for developing effective mitigation strategies.

Abstract

Machine learning models can inherit biases from their training data, leading to discriminatory or inaccurate predictions. This is particularly concerning with the increasing use of large, unsupervised datasets for training foundational models. Traditionally, demographic biases within these datasets have not been well-understood, limiting our ability to understand how they propagate to the models themselves. To address this issue, this paper investigates the propagation of demographic biases from datasets into machine learning models. We focus on the gender demographic component, analyzing two types of bias: representational and stereotypical. For our analysis, we consider the domain of facial expression recognition (FER), a field known to exhibit biases in most popular datasets. We use Affectnet, one of the largest FER datasets, as our baseline for carefully designing and generating subsets that incorporate varying strengths of both representational and stereotypical bias. Subsequently, we train several models on these biased subsets, evaluating their performance on a common test set to assess the propagation of bias into the models' predictions. Our results show that representational bias has a weaker impact than expected. Models exhibit a good generalization ability even in the absence of one gender in the training dataset. Conversely, stereotypical bias has a significantly stronger impact, primarily concentrated on the biased class, although it can also influence predictions for unbiased classes. These results highlight the need for a bias analysis that differentiates between types of bias, which is crucial for the development of effective bias mitigation strategies.
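The two dataset-level bias metrics used in the experiments (see Figure 2) can be sketched as follows. This is a minimal illustration, not the paper's code: ENS is assumed here to be the effective number of groups, exp of the Shannon entropy of the gender proportions (2 means perfectly balanced, 1 means a single gender), and stereotypical bias is measured as Cramér's V over the gender-by-expression contingency table (0 means no gender/class association, 1 means maximal association). The paper's exact definitions may differ in detail.

```python
import numpy as np

def ens(proportions):
    """Effective number of groups: exp(Shannon entropy) of group proportions.

    For two genders, 2.0 = perfectly balanced representation, 1.0 = one gender absent.
    """
    p = np.asarray(proportions, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(np.exp(-(p * np.log(p)).sum()))

def cramers_v(table):
    """Cramér's V for a contingency table (e.g., gender x expression counts).

    0.0 = genders distributed identically across classes (no stereotypical bias).
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row @ col / n           # independence model
    chi2 = ((table - expected) ** 2 / expected).sum()
    r, c = table.shape
    return float(np.sqrt(chi2 / (n * (min(r, c) - 1))))

# Hypothetical counts: rows = genders, columns = expression classes.
balanced = [[100, 100], [100, 100]]  # no gender/class association
skewed = [[180, 20], [100, 100]]     # one class over-represented for one gender
print(ens([0.5, 0.5]))       # -> 2.0 (balanced representation)
print(cramers_v(balanced))   # -> 0.0
print(cramers_v(skewed))     # > 0, stereotypical bias present
```

Varying the gender proportions fed to `ens`, or the skew of the contingency table fed to `cramers_v`, mirrors how the biased Affectnet subsets sweep each bias strength independently.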

Paper Structure

This paper contains 18 sections, 4 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Summary of the experimental methodology.
  • Figure 2: Dataset bias as measured with $\overline{\text{ENS}}$ (representational bias, left) and Cramer's $\text{V}$ (stereotypical bias, right), for datasets with different degrees of induced bias. In both plots, the horizontal red lines represent the bias metrics of the original Affectnet dataset.
  • Figure 3: (a) Recall difference (female recall minus male recall) for the representationally biased datasets. (b) Recall per class for the female group. (c) Recall per class for the male group. In all three, the horizontal axis shows the amount of induced bias (bias factor $f$).
  • Figure 4: (a) Recall difference (female recall minus male recall) for each stereotypically biased dataset with biased class angry. (b) Recall per class for the female group. (c) Recall per class for the male group. In all three, the horizontal axis shows the amount of induced bias (bias factor $f$).
  • Figure 5: Class recall differences (female to male) across six sets of stereotypically biased datasets, targeting disgust, fear, happy, neutral, sad and surprise classes. In all subplots, the horizontal axis represents the amount of induced bias (bias factor $f$).
  • ...and 1 more figure