Are Social Sentiments Inherent in LLMs? An Empirical Study on Extraction of Inter-demographic Sentiments
Kunitomo Tanaka, Ryohei Sasano, Koichi Takeda
TL;DR
This study investigates whether inter-demographic sentiments between groups defined by nationality, religion, and race/ethnicity are embedded in LLMs. It prompts five LLMs with questions about how one group feels toward another, applies sentiment analysis to the responses to produce scores in the range $-1$ to $+1$, and validates these scores against real poll data using the Pearson correlation coefficient $\rho$. The results show positive correlations for nationalities and religions, indicating that LLM outputs reflect the social sentiments measured in surveys, while the signal for race/ethnicity is weaker, where fewer data points were available. The work highlights the potential and limits of using LLMs to study social biases, and points to multilingual and broader-demographic extensions as directions for future research.
Abstract
Large language models (LLMs) are thought to acquire unconscious human knowledge and feelings, such as social common sense and biases, through training on large amounts of text. However, it is not clear to what extent various LLMs capture the sentiments of specific social groups. In this study, we focus on social groups defined in terms of nationality, religion, and race/ethnicity, and validate the extent to which sentiments between social groups can be captured in and extracted from LLMs. Specifically, we input questions regarding the sentiments of one group toward another into LLMs, apply sentiment analysis to the responses, and compare the results with social surveys. Validation using five representative LLMs showed higher correlations with relatively small p-values for nationalities and religions, for which the number of data points was relatively large. This result indicates that the inter-group sentiments expressed in LLM responses align well with actual social survey results.
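The comparison step described above can be illustrated with a minimal sketch. It assumes that each LLM response has already been assigned a sentiment score in $[-1, 1]$, averages the scores per ordered group pair, and computes the Pearson correlation with survey favorability figures. The group labels, scores, and survey values are made-up placeholders, and the helper names `mean_sentiment` and `pearson` are hypothetical; this is not the authors' implementation.

```python
import math


def mean_sentiment(scores):
    """Average per-response sentiment scores, each assumed to lie in [-1, 1]."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s < -1.0 or s > 1.0 for s in scores):
        raise ValueError("sentiment scores must lie in [-1, 1]")
    return sum(scores) / len(scores)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder data (NOT from the paper): keys are (source group, target group).
llm_scores = {
    ("A", "B"): [0.6, 0.4, 0.5],
    ("A", "C"): [-0.2, -0.4, -0.3],
    ("B", "C"): [0.1, 0.0, 0.2],
    ("C", "A"): [0.8, 0.7, 0.9],
}
# Placeholder survey favorability in percent for the same ordered pairs.
survey_favorability = {
    ("A", "B"): 70.0,
    ("A", "C"): 30.0,
    ("B", "C"): 45.0,
    ("C", "A"): 85.0,
}

pairs = sorted(llm_scores)  # fixed order so both lists stay aligned
llm_means = [mean_sentiment(llm_scores[p]) for p in pairs]
survey_values = [survey_favorability[p] for p in pairs]

r = pearson(llm_means, survey_values)
print(f"Pearson correlation over {len(pairs)} group pairs: {r:.3f}")
```

Because the Pearson coefficient is invariant to positive linear rescaling, the survey percentages do not need to be mapped onto $[-1, 1]$ before the comparison. With only a handful of group pairs the coefficient is unstable, which is one reason a small number of data points weakens the conclusions that can be drawn.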
