Are Social Sentiments Inherent in LLMs? An Empirical Study on Extraction of Inter-demographic Sentiments

Kunitomo Tanaka, Ryohei Sasano, Koichi Takeda

TL;DR

This study investigates whether inter-demographic sentiments between groups defined by nationalities, religions, and races/ethnicities are embedded in LLMs. It prompts five LLMs with inter-group sentiment questions, applies a sentiment scoring process to produce values in the range $-1$ to $+1$, and validates these scores against real poll data using the Pearson correlation coefficient $\rho$. The results show positive alignment for nationalities and religions, indicating that LLM outputs reflect crowd-sourced social sentiments, while race/ethnicity signals are weaker due to data sparsity. The work highlights the potential and limits of using LLMs to study social biases, and points to multilingual and broader-demographic extensions as directions for future research.

Abstract

Large language models (LLMs) are supposed to acquire unconscious human knowledge and feelings, such as social common sense and biases, by training on large amounts of text. However, it is not clear how much the sentiments of specific social groups can be captured in various LLMs. In this study, we focus on social groups defined in terms of nationality, religion, and race/ethnicity, and validate the extent to which sentiments between social groups can be captured in and extracted from LLMs. Specifically, we input questions regarding sentiments from one group to another into LLMs, apply sentiment analysis to the responses, and compare the results with social surveys. The validation results using five representative LLMs showed higher correlations with relatively small p-values for nationalities and religions, for which the number of data points was relatively large. This result indicates that LLM responses containing inter-group sentiments align well with actual social survey results.
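
The comparison step described above, correlating sentiment scores extracted from LLM responses with survey results, can be illustrated with a minimal sketch. This is not the authors' code: the group labels and all numeric values below are hypothetical placeholders, and `pearson_r` is a helper written for this illustration. Because the Pearson coefficient is invariant to positive linear rescaling, the poll values do not need to be on the same $-1$ to $+1$ scale as the sentiment scores.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder data: keys are (G_from, G_to) pairs.
# These numbers are invented for illustration, not taken from the paper.
llm_scores = {("A", "B"): 0.6, ("A", "C"): -0.2, ("B", "A"): 0.4, ("B", "C"): -0.5}
poll_scores = {("A", "B"): 0.5, ("A", "C"): -0.1, ("B", "A"): 0.2, ("B", "C"): -0.6}

# Compare only the group pairs present in both sources, in a fixed order.
pairs = sorted(llm_scores.keys() & poll_scores.keys())
r = pearson_r([llm_scores[p] for p in pairs], [poll_scores[p] for p in pairs])
print(f"Pearson correlation over {len(pairs)} pairs: {r:.3f}")
```

The sketch computes only the coefficient. A p-value for the test of no correlation, as reported in the paper, would need an additional step, for example `scipy.stats.pearsonr`, which returns both values.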

Paper Structure (18 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Motivation of our work contrasted with prior works.
  • Figure 2: Procedure for extraction of sentiments between social groups from LLMs.
  • Figure 3: Correlation coefficients between the actual poll results and the sentiment scores for each combination of LLM and set of question templates. The p-values of the test for no correlation are shown below each coefficient.
  • Figure 4: Sentiment scores between groups of different nationalities, extracted from GPT-4 responses, and the actual poll result. The vertical axis indicates the subject of the sentiment $G_\text{from}$ and the horizontal axis indicates the object of the sentiment $G_\text{to}$.