Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 Simulations

Sanguk Lee, Kai-Qi Yang, Tai-Quan Peng, Ruth Heo, Hui Liu

Abstract

Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. To investigate this, GPT-4 was assigned personas from four societies, using data from the 2022 Gallup World Poll. These synthetic samples were then prompted with or without a commitment statement intended to induce SDR. The results were mixed. While the commitment statement increased SDR index scores, suggesting SDR bias, it reduced civic engagement scores, indicating an opposite trend. Additional findings revealed demographic associations with SDR scores and showed that the commitment statement had limited impact on GPT-4's predictive performance. The study underscores potential avenues for using LLMs to investigate biases in both humans and LLMs themselves.

Paper Structure

This paper contains 18 sections and 4 figures.

Figures (4)

  • Figure 1: A) Estimated SDR Scores by Commitment Statement Condition across All Societies and B) Distribution of SDR Scores by Commitment Statement Condition for Each Society
  • Figure 2: A) Estimated CE by Commitment Statement Condition across All Societies and B) Distribution of CE Scores by Commitment Statement Condition for Each Society
  • Figure 3: Comparison of F1 Scores for Civic Engagement Activities between Commitment Statement Conditions. Note. Macro-F1 scores were reported for Overall Civic Engagement. For Charity Donation and Volunteer, F1 scores were computed for the “No” response; for Help Stranger, F1 scores were computed for the “Yes” response.
  • Figure 4: Stacked Bar Plots of Survey and Synthetic Sample Responses Regarding A) Charity Donations, B) Volunteer Time to Organizations, and C) Helping Strangers Across Four Societies.