AI's Regimes of Representation: A Community-centered Study of Text-to-Image Models in South Asia

Rida Qadri, Renee Shelby, Cynthia L. Bennett, Remi Denton

TL;DR

The study shows how generative AI can reproduce an outsider's gaze on South Asian cultures, shaped by global and regional power inequities, and locates these failures within participants' reports of their existing social marginalizations.

Abstract

This paper presents a community-centered study of the cultural limitations of text-to-image (T2I) models in the South Asian context. We theorize these failures using scholarship on dominant media regimes of representation and locate them within participants' reporting of their existing social marginalizations. We thus show how generative AI can reproduce an outsider's gaze for viewing South Asian cultures, shaped by global and regional power inequities. By centering communities as experts and soliciting their perspectives on T2I limitations, our study adds rich nuance to existing evaluative frameworks and deepens our understanding of the culturally specific ways AI technologies can fail in non-Western and Global South settings. We distill lessons for responsible development of T2I models, recommending concrete pathways forward that can allow for recognition of structural inequalities.

Paper Structure

This paper contains 33 sections, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Example of a DALL-E generated image for the prompt "Heer Ranjha" (left) juxtaposed with a canonical representation of Heer Ranjha (right), showing that the generated image resembles a monarch or warrior instead of a couple from rural Punjab.
  • Figure 2: Generated images, from Imagen and Parti, for prompt "A photo of a house of worship" showing Western-looking churches.
  • Figure 3: Generated images, from Stable Diffusion and DALL-E, for prompt "People spending their day in Peshawar" showing dusty streets and markers of poverty and none of Peshawar’s rich cultural heritage.
  • Figure 4: Example of (a) a Parti generated image for the prompt "Baitul Mukarram National Masjid" juxtaposed with (b) a photograph of Baitul Mukarram National Masjid. The generated image lacks the distinct architectural features of Baitul Mukarram National Masjid.
  • Figure 5: Example of (a) a DALL-E generated image for the prompt "A woman in the style of artist Sadequain" juxtaposed with (b) a painting by Sadequain. The generated image does not capture the artist's style, notably lacking the distinctive calligraphy and color themes.
  • ...and 4 more figures