Measuring Political Bias in Large Language Models: What Is Said and How It Is Said

Yejin Bang; Delong Chen; Nayeon Lee; Pascale Fung

Abstract

We propose to measure political bias in LLMs by analyzing both the content and the style of the text they generate about political issues. Existing benchmarks and measures focus on gender and racial biases. However, political bias also exists in LLMs and can lead to polarization and other harms in downstream applications. To provide transparency to users, we advocate fine-grained and explainable measures of the political bias in LLM-generated text. Our proposed measure covers different political issues, such as reproductive rights and climate change, and examines both the content (the substance of the generation) and the style (the lexical polarity) of such bias. We measure the political bias of eleven open-source LLMs and show that the proposed framework scales easily to other topics and is explainable.

Paper Structure (45 sections, 2 equations, 10 figures, 2 tables)

Introduction
The Framework
Political Stance Analysis
Framing Bias Analysis
Framework Implementation
Political Stance Analysis
Political Topics
Task Instruction
Reference Anchor Generation
Distance Function
Stance Estimation
Framing Bias Analysis
Frame Selection for Content Bias Analysis
i. Boydstun's Frame Dimensions
ii. NER-based Frame Extraction
...and 30 more sections

Figures (10)

Figure 1: An overview of our proposed framework for measuring political bias in LLM-generated content. The two-tiered framework first evaluates the LLM's political stance over political topics and then framing bias in two aspects: content and style.
Figure 2: Overview of our proposed evaluation framework. Top row: We analyze the political stance of $f_\text{LLM}$ on specific topics by comparing the distribution of its generated content, $P(\hat{Y})$, with a pair of reference distributions, $P(Y_\text{pro})$ and $P(Y_\text{opp})$. These reference distributions correspond to two opposing political stances on certain topics. Bottom row: We further investigate framing bias by decomposing it into content bias and style bias. To achieve this, we employ a latent variable model to describe the model generation process. We then analyze two types of biases based on the identified content variable $C$ and style variable $S$.
Figure 3: Visualization of the process of political stance analysis. The model generations about different political topics are visualized using t-SNE. For each pair, the left-hand side shows the distributions of model generations on different topics, $P(\hat{Y})$, which are analyzed to measure political bias, and the right-hand side shows the reference extreme anchor distributions $P(Y_\text{pro}), P(Y_\text{opp})$ used for stance analysis (e.g., proponents and opponents). For instance, in the top-left corner, on reproductive rights, the model generation distribution (grey) overlaps with the proponent distribution, which indicates that the model takes an advocacy stance on reproductive rights.
Figure 4: Heatmap showing stances (red for opposition, blue for support, white for neutrality) and the norm of the stance vector $\|\vec{s}\|$ (numbers) of eleven LLMs across ten political issues. The scores in each cell are percentages (%). Each model's stance and intensity vary across topics, as seen in LLama-2-13b-chat's 23% support for Same-sex Marriage and 3.2% opposition to the Death Penalty. The higher the score, the more strongly the model is biased toward one stance.
Figure 5: Entity-Based Frame Analysis. Left: Comparison of entity mention frequencies across models, normalized by the average mentions across eleven models. "Avg." denotes the mean mention count. Right: Visualization of the top-10 entities for three models (Jais (13B), LLaMa2 (13B), Yi (34B)), with circle sizes indicating mention frequency and colors representing sentiment (positive, negative, and grey for neutral). Dashed borders indicate unique entities. For example, only Jais mentions the UAE neutrally, while both Jais and Yi negatively highlight the "Same Sex Marriage Ban."
...and 5 more figures
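
The stance analysis described in the captions of Figures 2 and 3 compares the distribution of model generations, $P(\hat{Y})$, with two reference anchor distributions, $P(Y_\text{pro})$ and $P(Y_\text{opp})$. The paper's actual distance function and stance vector are defined in sections not reproduced here, so the following is only a minimal sketch of the general idea under stated assumptions: texts are assumed to be already embedded as vectors, the distance is assumed to be a mean pairwise cosine distance, and the names `mean_cosine_distance` and `stance_lean` are hypothetical.

```python
import numpy as np


def mean_cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean cosine distance over all pairs (row of a, row of b).

    Assumption: this stands in for the paper's distance function,
    which is not specified in this excerpt.
    """
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(1.0 - a_unit @ b_unit.T))


def stance_lean(generated: np.ndarray,
                pro_anchor: np.ndarray,
                opp_anchor: np.ndarray) -> float:
    """Hypothetical signed score, not the paper's stance vector.

    Positive: generations sit closer to the proponent anchors.
    Negative: generations sit closer to the opponent anchors.
    """
    d_pro = mean_cosine_distance(generated, pro_anchor)
    d_opp = mean_cosine_distance(generated, opp_anchor)
    return d_opp - d_pro


# Toy 2-D "embeddings" made up for illustration; real ones would come
# from a sentence encoder applied to generated and reference texts.
generated = np.array([[1.0, 0.0], [0.9, 0.1]])
pro_anchor = np.array([[1.0, 0.0], [1.0, 0.1]])
opp_anchor = np.array([[0.0, 1.0], [0.1, 1.0]])

lean = stance_lean(generated, pro_anchor, opp_anchor)
print(f"lean = {lean:+.3f}")  # positive here: closer to the proponent anchors
```

In this toy setup the generated vectors point in nearly the same direction as the proponent anchors, so the score is positive; swapping the two anchor sets flips the sign.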
