A Taxonomy of the Biases of the Images created by Generative Artificial Intelligence

Adriana Fernández de Caleya Vázquez, Eduardo C. Garrido-Merchán

TL;DR

The paper addresses biases in image generation by proposing a comprehensive taxonomy spanning cultural, socioeconomic, biological, and demographic dimensions. It treats generative image models as conditional distributions $p(Y \mid X, \theta)$ whose parameters $\theta$ are learned by minimizing a loss $L$ on a dataset $D$, which links observed biases to the learned parameters and to the training data. Key contributions include the taxonomy, mechanisms for bias emergence, and practical mitigation paths such as dataset augmentation and regularizers, complemented by test batteries and policy considerations. The work provides an actionable framework for developers and policymakers to reduce harmful stereotypes in AI imagery and to guide future research.

Abstract

Generative artificial intelligence models perform remarkably well at creating unique content automatically from a user's prompt, which is revolutionizing fields such as marketing and design. Besides models that generate text, there are models that automatically generate high-quality, genuine images and videos from a prompt. Although their performance in image creation seems impressive, the content these models generate must be assessed carefully, since users are uploading this material to the internet on a massive scale. Critically, generative AI models are statistical models whose parameter values are estimated by algorithms that maximize the likelihood of the parameters given an image dataset. Consequently, if the image dataset is biased towards certain values of sensitive variables such as gender or skin color, the content these models generate can be harmful to certain groups of people. When users generate this content and upload it to the internet, these biases perpetuate harmful stereotypes about vulnerable groups and polarize social views of, for example, what beauty or disability is and means. In this work, we analyze in detail how the content generated by these models can be strongly biased with respect to a plethora of variables, which we organize into a new taxonomy for image generative AI. We also discuss the social, political, and economic implications of these biases and possible ways to mitigate them.
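
The link between likelihood maximization and bias can be illustrated with a deliberately tiny stand-in for a generative model. The sketch below is not the paper's method: it assumes a single categorical attribute with hypothetical values "A" and "B", a made-up 90/10 training split, and hypothetical helper names (`fit_categorical_mle`, `sample_from`). For a categorical distribution, the maximum-likelihood estimate is simply the empirical frequency, so the fitted model reproduces whatever imbalance the dataset contains.

```python
import random
from collections import Counter


def fit_categorical_mle(samples):
    """Maximum-likelihood fit of a categorical distribution.

    For categorical data the MLE of each probability is the empirical
    frequency of that value in the training samples.
    """
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in counts.items()}


def sample_from(model, n, rng):
    """Draw n values from the fitted distribution."""
    values = list(model)
    weights = [model[v] for v in values]
    return rng.choices(values, weights=weights, k=n)


# Hypothetical, made-up training set: 90% of the images carry
# attribute "A" and 10% carry attribute "B".
training_attributes = ["A"] * 900 + ["B"] * 100

model = fit_categorical_mle(training_attributes)

# Generate new "content" from the fitted model and measure how often "A" appears.
rng = random.Random(0)
generated = sample_from(model, 10_000, rng)
generated_share = Counter(generated)["A"] / len(generated)

print(model)
print(round(generated_share, 3))
```

Real image generators are far more complex, but the same reasoning applies under the stated assumption that parameters are fit to match the training distribution: without a corrective step such as rebalancing or augmenting the data, the skew in the dataset carries over to the generated samples.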

Paper Structure (24 sections, 11 figures)

This paper contains 24 sections, 11 figures.

Figures (11)

  • Figure 1: Taxonomy of the biases of image generative artificial intelligence models.
  • Figure 2: Naming bias example. The image on the left was generated from a prompt with the name Laura; the image on the right, from a prompt with the name Rigoberta.
  • Figure 3: Body type bias. The women in the AI-generated images are all slim, wear elegant clothes, and have white skin.
  • Figure 4: Facial feature bias. The faces shown follow similar patterns and share the same cultural beauty stereotype.
  • Figure 5: Hair bias. Long hair is always preferred for women.
  • ...and 6 more figures