Modeling Human Subjectivity in LLMs Using Explicit and Implicit Human Factors in Personas

Salvatore Giorgi, Tingting Liu, Ankit Aich, Kelsey Isman, Garrick Sherman, Zachary Fried, João Sedoc, Lyle H. Ungar, Brenda Curtis

TL;DR

LLMs may capture the statistical patterns of how people speak, but are generally unable to model the complex interactions and subtleties of human perceptions, potentially limiting their effectiveness in social science applications.

Abstract

Large language models (LLMs) are increasingly being used in human-centered social scientific tasks, such as data annotation, synthetic data creation, and engaging in dialog. However, these tasks are highly subjective and dependent on human factors, such as one's environment, attitudes, beliefs, and lived experiences. Thus, employing LLMs (which do not have such human factors) in these tasks may result in a lack of variation in data, failing to reflect the diversity of human experiences. In this paper, we examine the role of prompting LLMs with human-like personas and asking the models to answer as if they were a specific human. This is done explicitly, with exact demographics, political beliefs, and lived experiences, or implicitly via names prevalent in specific populations. The LLM personas are then evaluated via (1) a subjective annotation task (e.g., detecting toxicity) and (2) a belief generation task, where both tasks are known to vary across human factors. We examine the impact of explicit vs. implicit personas and investigate which human factors LLMs recognize and respond to. Explicit LLM personas show mixed results when reproducing known human biases, but generally fail to demonstrate implicit biases. We conclude that LLMs may capture the statistical patterns of how people speak, but are generally unable to model the complex interactions and subtleties of human perceptions, potentially limiting their effectiveness in social science applications.

Paper Structure (37 sections, 2 figures, 11 tables, 1 algorithm)

Figures (2)

  • Figure 1: Flow diagram for comparing personas, using explicit vs. implicit gender in the parenting domain as an example. We first prompt each of the 641 Persona-LLMs with the two personas being compared (explicit $e$ and implicit $i$) and ask each the relevant domain question, for a total of $2 \times 641$ generations. We then extract n-grams from each generation, where $m$ denotes the total number of n-grams. Next, we correlate each of the $m$ n-grams with the human factor labels for each persona type, giving $2m$ correlations. Finally, we correlate the correlations across the two persona types (two vectors of correlations, each of size $m$), which gives the final similarity metric.
  • Figure 2: Belief Generation Task (BGT1) n-grams correlated with (a) age, (b) gender, (c) political ideology, (d) race, and (e) substance use, using text generated from their respective domains. All correlations are significant at a Benjamini-Hochberg (BH) corrected $p<.05$. Word size reflects correlation strength (larger words are more strongly correlated with the human factor), and color indicates the n-gram's frequency in the data set (gray = low frequency, blue = moderate frequency, red = high frequency). Exact effect sizes are given in the paper's table of n-gram effect sizes.
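
The comparison described in the Figure 1 caption (extract n-grams, correlate each n-gram with the human factor label per persona type, then correlate the two correlation vectors) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the whitespace-style tokenizer, the use of relative n-gram frequencies, Pearson correlation, and a single shared label vector for both persona types are all choices made here for the example.

```python
import re
from collections import Counter

import numpy as np


def extract_ngrams(text, n_max=2):
    """Count 1..n_max-grams in a text (simple lowercase tokenizer; an assumption)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)


def ngram_matrix(texts, vocab):
    """Rows = generations, columns = relative frequency of each vocabulary n-gram."""
    index = {g: j for j, g in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, text in enumerate(texts):
        counts = extract_ngrams(text)
        total = sum(counts.values()) or 1
        for gram, c in counts.items():
            if gram in index:
                X[i, index[gram]] = c / total
    return X


def column_correlations(X, y):
    """Pearson r between every column of X and y; constant columns get r = 0."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (Xc * yc[:, None]).sum(axis=0) / denom
    return np.nan_to_num(r)


def persona_similarity(explicit_texts, implicit_texts, labels):
    """Correlate the two vectors of n-gram/label correlations (each of size m)."""
    # Shared vocabulary of m n-grams across both persona types (lists of strings).
    vocab = sorted(set().union(*(extract_ngrams(t) for t in explicit_texts + implicit_texts)))
    y = np.asarray(labels, dtype=float)
    r_explicit = column_correlations(ngram_matrix(explicit_texts, vocab), y)
    r_implicit = column_correlations(ngram_matrix(implicit_texts, vocab), y)
    return float(np.corrcoef(r_explicit, r_implicit)[0, 1])


if __name__ == "__main__":
    # Toy generations, invented for illustration only.
    explicit = ["i love my kids", "i love my kids dearly",
                "work comes first", "work and money come first"]
    implicit = ["my kids are my world", "i love family time",
                "work matters most", "money and work first"]
    labels = [1, 1, 0, 0]  # hypothetical binary human factor label per persona
    print(persona_similarity(explicit, implicit, labels))
```

A value near 1 would indicate that the same n-grams track the human factor under both persona types, while a value near 0 would indicate that the explicit and implicit personas express the factor through unrelated language. In a real analysis, the per-n-gram correlations would also need a multiple-comparison correction such as the Benjamini-Hochberg procedure mentioned in the Figure 2 caption, which this sketch omits.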