Analyzing Cultural Representations of Emotions in LLMs through Mixed Emotion Survey
Shiran Dudy, Ibrahim Said Ahmad, Ryoko Kitajima, Agata Lapedriza
TL;DR
This study investigates how Large Language Models represent emotions across cultures by replicating Miyamoto et al.'s mixed-emotion survey in multiple languages and contexts. Using five LLMs (three open-source, two private) and three study designs, it tests English versus Japanese prompts, contextual language cues, and cross-language comparisons among East Asian and Western languages. The findings indicate only limited alignment with human data, with the written language having a stronger influence than explicit contextual cues about speaker origin, and East Asian languages showing more cross-language similarity than Western ones. The work highlights methodological avenues for assessing cultural alignment in LLMs and underscores the need for careful interpretation when using LLMs to model cross-cultural emotions. It also suggests directions for more nuanced, bias-aware evaluations and data-driven improvements in multilingual cultural representation.
Abstract
Large Language Models (LLMs) have gained widespread global adoption, showcasing advanced linguistic capabilities across multiple languages. There is growing interest in academia in using these models to simulate and study human behaviors. However, it is crucial to acknowledge that an LLM's proficiency in a specific language might not fully encapsulate the norms and values associated with its culture. Concerns have emerged regarding potential biases towards Anglo-centric cultures and values due to the predominance of Western and US-based training data. This study focuses on analyzing the cultural representations of emotions in LLMs, in the specific case of mixed-emotion situations. Our methodology is based on the studies of Miyamoto et al. (2010), which identified distinctive emotional indicators in Japanese and American human responses. First, we administer their mixed-emotion survey to five different LLMs and analyze their outputs. Second, we experiment with contextual variables to explore variations in responses considering both language and speaker origin. Third, we expand our investigation to encompass additional languages of East Asian and Western European origin to gauge their alignment with their respective cultures, anticipating a closer fit. We find that (1) models have limited alignment with the evidence in the literature; (2) written language has a greater effect on LLMs' responses than information on participants' origin; and (3) LLMs' responses were more similar across East Asian languages than across Western European languages.
