"I don't see myself represented here at all": User Experiences of Stable Diffusion Outputs Containing Representational Harms across Gender Identities and Nationalities
Sourojit Ghosh, Nina Lutz, Aylin Caliskan
TL;DR
This study probes how Stable Diffusion outputs reflect or distort users' gender and nationality identities, revealing a substantial gap between user expectations and generated images and documenting representational harms such as erasure, stereotyping, dehumanization, and disparagement. Drawing on the largest known human-subjects study of a text-to-image generator (T2I) to date, it combines data from 133 crowdworkers and 14 interviews with intra-set cosine similarity analyses and thematic coding across 136 prompts and 50 images per prompt, highlighting a pervasive misalignment in the most representative outputs (similarity range $0.83$ to $0.67$). The findings demonstrate widespread harms affecting marginalized groups and motivate a harm-aware design framework that centers harm reduction, community-informed data practices, and harms-centric evaluation metrics, along with iterative development that accounts for downstream tasks. These contributions offer a practical pathway for reducing representational harms in T2Is and inform policy and design practices for safer, more inclusive generative systems.
Abstract
Though research into text-to-image generators (T2Is) such as Stable Diffusion has demonstrated their amplification of societal biases and potential to cause harm, such research has primarily relied on computational methods rather than seeking information from the real users who experience harm, leaving a significant knowledge gap. In this paper, we conduct the largest human-subjects study of Stable Diffusion, combining crowdsourced data from 133 crowdworkers with 14 semi-structured interviews across diverse countries and genders. Through a mixed-methods approach of intra-set cosine similarity hierarchies (i.e., comparing multiple Stable Diffusion outputs for the same prompt with each other to examine which result is 'closest' to the prompt) and qualitative thematic analysis, we first demonstrate a large disconnect between user expectations for Stable Diffusion outputs and the images actually generated, evidenced by a set of Stable Diffusion renditions of 'a Person' that fall far from such expectations. We then extend this finding of general dissatisfaction to highlight representational harms that Stable Diffusion causes our subjects, especially those with traditionally marginalized identities, by subjecting them to incorrect and often dehumanizing stereotypes about their identities. We provide recommendations for a harm-aware approach to (re)designing future versions of Stable Diffusion and other T2Is.
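The intra-set comparison mentioned above can be illustrated with a minimal sketch. It assumes each generated image for a prompt has already been mapped to an embedding vector by some image encoder (not specified here), and it ranks the images by their mean cosine similarity to the other images in the same set. The function names, the mean-based score, and the toy vectors are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_intra_set_similarity(embeddings):
    """Rank images in one prompt's output set (needs at least two images).

    Each image is scored by its mean cosine similarity to every other
    image in the set (an assumed scoring rule). Returns (index, score)
    pairs sorted from most to least representative.
    """
    n = len(embeddings)
    scores = []
    for i in range(n):
        others = [cosine(embeddings[i], embeddings[j]) for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy 2-D "embeddings" standing in for real image features (hypothetical data).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ranking = rank_by_intra_set_similarity(toy)
for index, score in ranking:
    print(f"image {index}: mean similarity {score:.3f}")
```

In this toy set, the image at index 1 ranks first because it lies between the other two vectors, and the image at index 2 ranks last because it is nearly orthogonal to both.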
