Erasing 'Ugly' from the Internet: Propagation of the Beauty Myth in Text-Image Models

Tanvi Dinkar, Aiqi Jiang, Gavin Abercrombie, Ioannis Konstas

TL;DR

The paper analyzes how beauty norms are encoded in two AI generation pipelines that span text-to-image and LLM-assisted image creation, using a novel beauty taxonomy to generate nearly 6,000 images and a 1,200-image human-evaluation subset. It finds pervasive biases toward lighter skin tones, younger appearances, and NSFW content, with negative prompting amplifying sexualization and non-binary targets showing especially young and sexualized traits; neutral prompts tend to yield more realistic outputs. The study highlights that safety filters alone cannot mitigate these biases, as data streams can become polluted with biased visuals and eroded diversity, raising societal concerns about lookism and data integrity. It contributes a rigorous, cross-modal analysis approach and calls for broader mitigation strategies that go beyond simple content filtering. The findings have practical implications for how generative AI should be trained, evaluated, and deployed in contexts affecting body image and representation.

Abstract

Social media has exacerbated the promotion of Western beauty norms, leading to negative self-image, particularly in women and girls, and causing harm such as body dysmorphia. Increasingly, content on the internet is artificially generated, leading to concerns that these norms are being exaggerated. The aim of this work is to study how generative AI models may encode 'beauty' and erase 'ugliness', and to discuss the implications of this for society. To investigate these aims, we create two image generation pipelines: a text-to-image model and a text-to-language-model-to-image model. We develop a structured beauty taxonomy which we use to prompt three language models (LMs) and two text-to-image models to cumulatively generate 5984 images using our two pipelines. We then recruit women and non-binary social media users to evaluate 1200 of the images through a Likert-scale within-subjects study. Participants show high agreement in their ratings. Our results show that 86.5% of generated images depicted people with lighter skin tones, 22% contained explicit content despite Safe for Work (SFW) training, and 74% were rated as being in a younger age demographic. In particular, the images of non-binary individuals were rated as both younger and more hypersexualised, indicating troubling intersectional effects. Notably, prompts encoded with 'negative' or 'ugly' beauty traits (such as "a wide nose") consistently produced higher Not SFW (NSFW) ratings regardless of gender. This work sheds light on the pervasive demographic biases related to beauty standards present in generative AI models, biases that are actively perpetuated by model developers, such as via negative prompting. We conclude by discussing the implications of this for society, which include pollution of the data streams and active erasure of features that do not fall inside the stereotype of what is considered beautiful by developers.
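
As a rough illustration of the two pipelines described in the abstract, the sketch below enumerates the routes a prompt could take: directly into a text-to-image model, or through a language model first. It is not the authors' implementation. The model identifiers, the "A <target> with <trait>" prompt template, the assumption that every language model is paired with every image model, and the `expand` and `render` callables are all hypothetical placeholders.

```python
from itertools import product

# Placeholder identifiers: the abstract does not name the models.
LANGUAGE_MODELS = ["lm_1", "lm_2", "lm_3"]  # three language models
IMAGE_MODELS = ["t2i_1", "t2i_2"]           # two text-to-image models


def build_prompt(target: str, trait: str) -> str:
    """Fill an assumed 'A <target> with <trait>' template."""
    return f"A {target} with {trait}"


def pipeline_configs():
    """List every (language_model, image_model) route a prompt can take.

    language_model is None for the direct text-to-image pipeline.
    Full crossing of language and image models is an assumption.
    """
    direct = [(None, image_model) for image_model in IMAGE_MODELS]
    via_lm = list(product(LANGUAGE_MODELS, IMAGE_MODELS))
    return direct + via_lm


def generate(prompt, config, expand, render):
    """Run one prompt through one route using caller-supplied model functions."""
    language_model, image_model = config
    if language_model is not None:
        # Second pipeline: the language model rewrites the prompt first.
        prompt = expand(language_model, prompt)
    return render(image_model, prompt)
```

Passing the model calls in as `expand` and `render` keeps the sketch independent of any particular model API; a caller would supply real functions in their place, for example with `build_prompt("woman", "a wide nose")` as the input prompt.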

Paper Structure

This paper contains 41 sections, 1 equation, 6 figures, 17 tables.

Figures (6)

  • Figure 1: The two image generation pipelines.
  • Figure 2: Base prompt as LLM input avoiding guardrailing.
  • Figure 3: The Fitzpatrick scale used to ask participants about the skin tone of the generated image.
  • Figure 4: Summary of answers collected from 8 survey questions. The full set of questions can be seen in Table \ref{tab:questions_for_annotators}.
  • Figure 5: NOTE: we have censored some images, as they contain nudity. Image grid for generated images of men, women, and non-binary people. The columns indicate which pipeline and image model was used. The prompt used for each row is as follows, with polarity information if present: (a) row 1 - 'A man with no wrinkles', row 2 - 'A man with deep wrinkles', row 3 (negative) - 'A man with a hooked nose'; (b) row 1 (positive) - 'A woman with a straight nose', row 2 (neutral) - 'A woman with narrow lips', row 3 (negative) - 'A overweight (excessively heavy) woman'; (c) row 1 (positive) - 'A non-binary person with lush lips', row 2 (neutral) - 'A non-binary person with thin eyelids', row 3 (negative) - 'A non-binary person with large ears'.
  • ...and 1 more figures