Effect of Gender Fair Job Description on Generative AI Images
Finn Böckling, Jan Marquenie, Ingo Siegert
TL;DR
This study investigates how linguistic gender forms influence AI-generated occupational imagery in STEM and non-STEM fields. Images were generated with DALL-E and FLUX from prompts in German (generic masculine and pair form) and in English. The analysis covers 150 STEM prompts plus 60 non-STEM prompts; independent raters assessed gender representation, with rater reliability checked via the intraclass correlation coefficient (ICC) and group differences tested with chi-square tests. Male bias in STEM persists across all forms: the German pair form reduces it (notably in FLUX), while English prompts remain biased toward men. The effect of linguistic form is statistically significant, for example $\chi^2(1, N = 355) = 33.74$, $p < .00001$. Diversity metrics show limited non-white and non-Asian representation, especially for DALL-E, and greater within-group variation for FLUX. These findings highlight the role of language and training data in shaping GenAI biases and call for bias-aware prompting, diverse datasets, and broader evaluation to mitigate gender and ethnic stereotypes in AI-generated imagery.
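As a quick sanity check on the reported statistic, the p-value for a chi-square test with one degree of freedom can be recomputed from the test statistic alone. The sketch below is not the authors' analysis code; the function name `chi2_sf_df1` is a hypothetical helper, and it relies only on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Hypothetical helper for illustration; not taken from the study.
    """
    # With df = 1 the statistic equals Z**2 for a standard normal Z, so
    # P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


reported_stat = 33.74  # chi-square value reported in the text (df = 1, N = 355)
p_value = chi2_sf_df1(reported_stat)

# The text reports p < .00001; this prints the recomputed tail probability.
print(f"p = {p_value:.2e}")
print("consistent with p < .00001:", p_value < 1e-5)
```

The closed form only holds for one degree of freedom; tables with more categories would need a general chi-square survival function from a statistics library.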
Abstract
STEM fields are traditionally male-dominated, with gender biases shaping perceptions of job accessibility. This study analyzed gender representation in STEM occupation images generated by OpenAI DALL-E 3 and Black Forest FLUX.1 using 150 prompts in three linguistic forms: German generic masculine, German pair form, and English. As a control, 20 pictures of social occupations were also generated. Results revealed significant male bias across all forms; the German pair form showed reduced bias but still overrepresented men in the STEM group, and produced mixed results for the social occupations group. These findings highlight generative AI's role in reinforcing societal biases and emphasize the need for further discussion of diversity in AI. Age distribution and ethnic diversity were also analyzed.
