Misalignment of LLM-Generated Personas with Human Perceptions in Low-Resource Settings

Tabia Tanzin Prama, Christopher M. Danforth, Peter Sheridan Dodds

TL;DR

This paper examines whether LLMs can authentically generate culturally situated personas for Bangladesh and how humans perceive those personas. The authors construct eight personas spanning religion, gender, and political affiliation, and evaluate responses to a dataset of 100 culturally anchored questions using the Persona Perception Scale (PPS). Results show a persistent gap between human and LLM performance: human responses achieve higher accuracy and higher PPS scores than every LLM, and LLM responses exhibit a Pollyanna-style positivity bias in sentiment. The study highlights the risks of deploying LLM-generated personas in social science research without validating them against real human data, and calls for careful calibration in low-resource, culturally diverse contexts.

Abstract

Recent advances enable Large Language Models (LLMs) to generate AI personas, yet their lack of deep contextual, cultural, and emotional understanding poses a significant limitation. This study quantitatively compared human responses with those of eight LLM-generated social personas (e.g., Male, Female, Muslim, Political Supporter) in a low-resource setting, Bangladesh, using culturally specific questions. Results show that human responses significantly outperform all LLMs in answering questions and across all metrics of persona perception, with particularly large gaps in empathy and credibility. Furthermore, LLM-generated content exhibited a systematic bias along the lines of the "Pollyanna Principle", scoring measurably higher in positive sentiment ($\Phi_{\text{avg}} = 5.99$ for LLMs vs. $5.60$ for humans). These findings suggest that LLM personas do not accurately reflect the authentic experience of real people in resource-scarce environments. It is essential to validate LLM personas against real-world human data to ensure their alignment and reliability before deploying them in social science research.

Paper Structure

This paper contains 8 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Examining the Discrepancy in Alignment Between LLM-Generated and Human Personas in Low-Resource Contexts: A Case Study of Bangladesh.
  • Figure 2: Comparison of human and LLM performance on persona perception tasks. (a) The average accuracy of seven LLMs compared with human evaluators. (b) The mean scores and standard deviations for the six persona perception dimensions of the PPS.
  • Figure 3: Word shift graph comparing the happiness of real human responses and LLM-generated personas' responses. Words are ranked by their percentage contribution to the change in average happiness, $\Phi_{\text{avg}}$. The real human responses are set as the reference text $T_\textnormal{ref}$, with the respective LLM-generated personas' responses as the comparison text $T_\textnormal{comp}$. Individual word contributions to the shift are indicated by two symbols: $+/-$ shows the word is relatively positive/negative compared with the average happiness of $T_\textnormal{ref}$, and $\uparrow/\downarrow$ shows the word is more/less prevalent in $T_\textnormal{comp}$ than in $T_\textnormal{ref}$. The four bars on the top indicate the total contribution of the four types of words $(+ \uparrow, + \downarrow, - \uparrow, -\downarrow)$. Relative text size is represented by the areas of the gray squares.
  • Figure 4: Topical distribution of questions in the dataset.
  • Figure 5: Accuracy (%) of seven LLMs compared to the human baseline (red dashed line) across three key sociocultural axes: Political Affiliation (Top), Gender (Middle), and Religious Identity (Bottom), based on contextually sensitive queries in Bangladesh.
  • ...and 1 more figures
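
The word shift described in the Figure 3 caption can be illustrated with a minimal Python sketch. It assumes the standard lexicon-based definition, in which $\Phi_{\text{avg}}$ is the frequency-weighted mean of per-word happiness scores and each word's contribution is $(\phi_w - \Phi_{\text{ref}})(p_w^{\text{comp}} - p_w^{\text{ref}})$, normalized to a percentage. The function names, the toy lexicon, and the toy token lists are hypothetical and are not taken from the paper; the authors' actual lexicon and tooling are not specified in this section.

```python
from collections import Counter

def word_probs(tokens, scores):
    """Normalized frequencies of the tokens that appear in the lexicon."""
    counts = Counter(t for t in tokens if t in scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens matched the lexicon")
    return {w: c / total for w, c in counts.items()}

def avg_happiness(probs, scores):
    """Phi_avg: frequency-weighted mean of per-word happiness scores."""
    return sum(scores[w] * p for w, p in probs.items())

def word_shift(ref_tokens, comp_tokens, scores):
    """Percentage contribution of each word to Phi_comp - Phi_ref."""
    p_ref = word_probs(ref_tokens, scores)
    p_comp = word_probs(comp_tokens, scores)
    phi_ref = avg_happiness(p_ref, scores)
    phi_comp = avg_happiness(p_comp, scores)
    delta = phi_comp - phi_ref
    if delta == 0:
        raise ValueError("texts have identical average happiness")
    shifts = {}
    for w in set(p_ref) | set(p_comp):
        dp = p_comp.get(w, 0.0) - p_ref.get(w, 0.0)
        if dp == 0:
            continue  # unchanged frequency contributes nothing
        # "+"/"-": word is happier/sadder than the reference average.
        # "up"/"down": word is more/less prevalent in the comparison text.
        kind = ("+" if scores[w] > phi_ref else "-") + ("up" if dp > 0 else "down")
        shifts[w] = (100 * (scores[w] - phi_ref) * dp / abs(delta), kind)
    return phi_ref, phi_comp, shifts

# Toy lexicon on a 1-9 scale (hypothetical values, for illustration only).
toy_scores = {"love": 8.4, "happy": 8.3, "work": 5.2, "hard": 3.6, "problem": 3.0}
human_tokens = ["hard", "problem", "work", "love"]   # reference text
llm_tokens = ["love", "happy", "work", "work"]       # comparison text

phi_ref, phi_comp, shifts = word_shift(human_tokens, llm_tokens, toy_scores)
for word, (pct, kind) in sorted(shifts.items(), key=lambda kv: -abs(kv[1][0])):
    print(f"{word:8s} {kind:6s} {pct:+.1f}%")
```

Because both frequency distributions sum to one, the signed percentages sum to +100 when the comparison text is happier than the reference and to -100 when it is less happy, which is why the per-word bars in a word shift graph can be read as shares of the total change.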