Representation Bias in Political Sample Simulations with Large Language Models

Weihong Qi; Hanjia Lyu; Jiebo Luo

TL;DR

The paper addresses representation bias in using LLMs to simulate political samples, focusing on vote choice and public opinion. It applies GPT-3.5-Turbo to data from ANES, GLES, Zuobiao, and CFPS, evaluating biases across language, demographics, and regime type via an Agreement Score, $\text{Agreement Score} = \frac{\sum_i S_{i,\text{Agree}}}{S_{\text{total}}}$, with $S_{i,\text{Agree}}$ indicating match quality. Key findings show higher accuracy for vote choice than for public opinion, with stronger performance in English-speaking, bipartisan, and democratic contexts and with older age groups; non-English, multi-party, and autocratic contexts yield poorer results, especially for Chinese samples. The results highlight biases in AI-driven social science simulations and motivate diversified multilingual training data and methodological improvements to enhance fairness across political contexts.
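
As a rough illustration of the Agreement Score, the sketch below computes it overall and per demographic group. It assumes that $S_{i,\text{Agree}}$ is a binary indicator (1 when the simulated response matches the human response, 0 otherwise) and that $S_{\text{total}}$ is the number of simulated responses; the summary does not spell out either definition, so both are assumptions. The function names, labels, and toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def agreement_score(actual, simulated):
    """Share of simulated responses that match the human responses.

    Assumption: each S_{i,Agree} is 1 on an exact match and 0 otherwise,
    and S_total is the number of compared responses.
    """
    if len(actual) != len(simulated):
        raise ValueError("actual and simulated must have the same length")
    if not actual:
        raise ValueError("at least one response is required")
    matches = sum(1 for a, s in zip(actual, simulated) if a == s)
    return matches / len(actual)


def agreement_by_group(actual, simulated, groups):
    """Agreement score computed separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for a, s, g in zip(actual, simulated, groups):
        buckets[g][0].append(a)
        buckets[g][1].append(s)
    return {g: agreement_score(a, s) for g, (a, s) in buckets.items()}


# Toy data (hypothetical): four respondents, two age groups.
actual = ["Dem", "Rep", "Dem", "Rep"]
simulated = ["Dem", "Rep", "Rep", "Rep"]
groups = ["18-29", "18-29", "60+", "60+"]

overall = agreement_score(actual, simulated)               # 3 of 4 match
by_group = agreement_by_group(actual, simulated, groups)   # per-group scores
print(overall, by_group)
```

Comparing the per-group scores against the overall score is one simple way to surface the kind of demographic disparity the paper reports; a graded rather than binary notion of match quality would need a different per-response term.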

Abstract

This study seeks to identify and quantify biases in simulating political samples with Large Language Models, specifically focusing on vote choice and public opinion. Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao Dataset, and China Family Panel Studies to simulate voting behaviors and public opinions. This methodology enables us to examine three types of representation bias: disparities based on the country's language, demographic groups, and political regime types. The findings reveal that simulation performance is generally better for vote choice than for public opinions, more accurate in English-speaking countries, more effective in bipartisan systems than in multi-partisan systems, and stronger in democratic settings than in authoritarian regimes. These results contribute to enhancing our understanding and developing strategies to mitigate biases in AI applications within the field of computational social science.

Paper Structure (7 sections, 3 figures)

This paper contains 7 sections, 3 figures.

Introduction
Related Work
Methods
Results
Vote Choice
Public Opinion
Discussions and Conclusions

Figures (3)

Figure 1: Simulation results for US and German samples in vote choice across demographic groups. A higher agreement score signifies greater similarity between the actual responses from human survey participants and those generated by the simulation.
Figure 2: Simulation results for US and German samples in vote choice across parties. A higher agreement score signifies greater similarity between the actual responses from human survey participants and those generated by the simulation.
Figure 3: Simulation results for US and Chinese samples in public opinion regarding different political issues. A higher agreement score signifies greater similarity between the actual responses from human survey participants and those generated by the simulation.
