Risks of Cultural Erasure in Large Language Models

Rida Qadri, Aida M. Davani, Kevin Robinson, Vinodkumar Prabhakaran

TL;DR

Large language models shape access to global cultures, prompting the need for benchmarks that diagnose cultural erasure. The authors develop an erasure lens with two concepts, omission (lack of representation) and simplification (one-dimensional portrayal), and test it on two tasks: describing locales and generating travel recommendations, using cities as standardized proxies. Across PaLM and PaLM 2-based scenarios, they demonstrate Western-centric cultural representations and notable regional omissions: descriptions of places in Africa and parts of Asia often emphasize economic themes while underrepresenting cultural ones, and within-region diversity is frequently collapsed to a few cities. The work argues for culturally informed evaluation frameworks and interdisciplinary collaboration to create more inclusive generative AI systems and benchmarks that reflect power dynamics in global cultural production.

Abstract

Large language models are increasingly being integrated into applications that shape the production and discovery of societal knowledge, such as search, online education, and travel planning. As a result, language models will shape how people learn about, perceive, and interact with global cultures, making it important to consider whose knowledge systems and perspectives are represented in models. Recognizing this importance, a growing body of work in machine learning and NLP has focused on evaluating gaps in the distribution of global cultural representation within model outputs. However, more work is needed on developing benchmarks for cross-cultural impacts of language models that stem from a nuanced, sociologically aware conceptualization of cultural impact or harm. We join this line of work, arguing for the need for metricizable evaluations of language technologies that interrogate and account for historical power inequities and differential impacts of representation on global cultures, particularly for cultures already under-represented in digital corpora. We look at two concepts of erasure: omission, where cultures are not represented at all, and simplification, where cultural complexity is erased by presenting one-dimensional views of a rich culture. The former focuses on whether something is represented, and the latter on how it is represented. We focus our analysis on two task contexts with the potential to influence global cultural production. First, we probe the representations that a language model produces about different places around the world when asked to describe these places. Second, we analyze the cultures represented in the travel recommendations produced by a set of language model applications. Our study shows ways in which the NLP community and application developers can begin to operationalize complex socio-cultural considerations into standard evaluations and benchmarks.

Paper Structure (23 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Culture and economy themes in representations of places within each region. Scores on the y-axis are the probability of the theme being present in a representation of a place within that region. Presence of a theme is determined by a majority vote across three raters.
  • Figure 2: Bar chart showing the presence of cultural and economic themes in descriptions of various cities, with cities from Africa and Asia in bold. While 9 of the 10 cities with the least cultural representation are from Africa and Asia, these regions account for only 2 of the 10 cities with the highest cultural representation. Conversely, only 2 of the 10 cities with the least economic representation are from Africa and Asia, while these regions account for 7 of the 10 cities with the highest representation on metrics of economy/economic conditions.
  • Figure 3: The graphs show how frequently each region is recommended as a travel destination when a specific interest area (e.g., art or museums) is mentioned.
  • Figure 4: Distribution of references to cities within each region, across different application contexts.
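
The scoring described in the Figure 1 caption (theme presence decided by a majority vote across three raters, then reported as a per-region probability) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the data layout, and the toy ratings below are all assumptions made for illustration, and the region labels are placeholders rather than results from the paper.

```python
from collections import defaultdict


def majority_present(votes):
    """Return True when more than half of the rater votes mark the theme as present."""
    return sum(votes) > len(votes) / 2


def theme_probability_by_region(annotations):
    """Compute, per region, the fraction of place descriptions in which the theme is present.

    `annotations` is an iterable of (region, votes) pairs, one pair per place
    description; `votes` holds one 0/1 judgement per rater (hypothetical layout).
    """
    present = defaultdict(int)
    total = defaultdict(int)
    for region, votes in annotations:
        total[region] += 1
        if majority_present(votes):
            present[region] += 1
    return {region: present[region] / total[region] for region in total}


# Invented toy ratings for a single theme; not data from the paper.
toy_culture_annotations = [
    ("Region A", (1, 1, 0)),  # two of three raters see the theme: present
    ("Region A", (0, 0, 1)),  # one of three raters sees the theme: absent
    ("Region B", (1, 1, 1)),
    ("Region B", (1, 0, 1)),
]

culture_scores = theme_probability_by_region(toy_culture_annotations)
print(culture_scores)  # {'Region A': 0.5, 'Region B': 1.0}
```

Running the same function once per theme (for example, once on culture ratings and once on economy ratings) would give the pair of per-region scores that a chart like Figure 1 plots, under the assumptions stated above.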