LLMs and Cultural Values: the Impact of Prompt Language and Explicit Cultural Framing

Bram Bulté, Ayla Rigouts Terryn

TL;DR

This study systematically evaluates how prompt language and explicit cultural framing shape LLMs' expressions of cultural values using Hofstede's VSM and the World Values Survey across 11 languages and 10 models. It shows that while a targeted prompt language and an explicit cultural perspective can nudge outputs toward a country’s values, all models retain a strong bias toward a limited set of Western, secular, and affluent cultures, and combining language with cultural framing does not reliably overcome this bias. The analysis reveals that LLMs tend to produce neutral-to-progressive stances on many topics, with notable misalignment on ethically or culturally charged items and evidence of stereotypes. The work highlights the need for multilingual, culturally aware design and cautions against treating LLMs as stand-ins for real-world survey respondents, suggesting future paths toward culture-specific models or more nuanced cross-cultural evaluation frameworks.

Abstract

Large Language Models (LLMs) are rapidly being adopted by users across the globe, who interact with them in a diverse range of languages. At the same time, there are well-documented imbalances in the training data and optimisation objectives of this technology, raising doubts as to whether LLMs can represent the cultural diversity of their broad user base. In this study, we look at LLMs and cultural values and examine how prompt language and cultural framing influence model responses and their alignment with human values in different countries. We probe 10 LLMs with 63 items from the Hofstede Values Survey Module and World Values Survey, translated into 11 languages, and formulated as prompts with and without different explicit cultural perspectives. Our study confirms that both prompt language and cultural perspective produce variation in LLM outputs, but with an important caveat: While targeted prompting can, to a certain extent, steer LLM responses in the direction of the predominant values of the corresponding countries, it does not overcome the models' systematic bias toward the values associated with a restricted set of countries in our dataset: the Netherlands, Germany, the US, and Japan. All tested models, regardless of their origin, exhibit remarkably similar patterns: They produce fairly neutral responses on most topics, with selective progressive stances on issues such as social tolerance. Alignment with cultural values of human respondents is improved more with an explicit cultural perspective than with a targeted prompt language. Unexpectedly, combining both approaches is no more effective than cultural framing with an English prompt. These findings reveal that LLMs occupy an uncomfortable middle ground: They are responsive enough to changes in prompts to produce variation, but too firmly anchored to specific cultural defaults to adequately represent cultural diversity.
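
The design described in the abstract crosses prompt language with the presence or absence of an explicit cultural perspective. The following is a minimal sketch of how such a condition grid could be enumerated, not the authors' implementation. The language codes, the language-to-country mapping, the function names `build_conditions` and `render_prompt`, and the framing sentence are all hypothetical; the abstract does not list the 11 languages or give the prompt wording. Translating the item into the prompt language is assumed to happen elsewhere.

```python
from itertools import product

# Hypothetical mapping: the abstract does not list the 11 languages,
# so these three entries are illustrative placeholders only.
LANGUAGE_TO_COUNTRY = {
    "nl": "the Netherlands",
    "de": "Germany",
    "ja": "Japan",
}


def build_conditions(language_to_country, baseline_language="en"):
    """Enumerate four prompting conditions per target country:
    baseline language or target language, each with or without
    an explicit cultural perspective."""
    conditions = []
    for language, country in language_to_country.items():
        for prompt_language, perspective in product(
            (baseline_language, language), (None, country)
        ):
            conditions.append(
                {
                    "target_country": country,
                    "prompt_language": prompt_language,
                    "perspective": perspective,
                }
            )
    return conditions


def render_prompt(item_text, perspective=None):
    """Prefix a survey item with a framing sentence when a perspective is given.
    The framing wording here is an assumption, not taken from the paper."""
    if perspective is None:
        return item_text
    framing = f"Answer from the perspective of a typical person from {perspective}."
    return f"{framing}\n{item_text}"


if __name__ == "__main__":
    for condition in build_conditions(LANGUAGE_TO_COUNTRY):
        print(condition)
```

Under this sketch, each target country is compared across the same four cells, which mirrors the abstract's comparison of a targeted prompt language, an explicit cultural perspective with an English prompt, and the combination of both.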

Paper Structure

This paper contains 60 sections, 1 figure, and 45 tables.

Figures (1)

  • Figure 1: A schematic overview of the methodology.