Assessing Political Bias in Large Language Models

Luca Rettenberger, Markus Reischl, Mark Schutera

TL;DR

The paper investigates political bias in open-source Large Language Models by measuring their alignment with German party positions using the Wahl-O-Mat framework in the context of the 2024 European Parliament elections. It compares multiple models across German and English prompts, revealing language- and model-size dependent biases, with Llama3-70B showing strong left-leaning alignment in both languages and AfD alignment remaining consistently low. The results demonstrate that language input significantly shapes perceived bias, suggesting that model capacity and training data influence how political content is generated at scale. The work highlights the necessity of bias transparency, robust evaluation, and human-in-the-loop safeguards to protect democratic processes while enabling the constructive use of AI in political contexts.

Abstract

The assessment of bias within Large Language Models (LLMs) has emerged as a critical concern in the contemporary discourse surrounding Artificial Intelligence (AI), given their potential impact on societal dynamics. Recognizing and considering political bias within LLM applications is especially important when closing in on the tipping point toward performative prediction. It then becomes essential to be educated about the potential effects and the societal behavior that LLMs can drive at scale through their interplay with human operators. In this way, the upcoming elections of the European Parliament will not remain unaffected by LLMs. We evaluate the political bias of the currently most popular open-source LLMs (instruct or assistant models) concerning political issues within the European Union (EU) from a German voter's perspective. To do so, we use the "Wahl-O-Mat," a voting advice application used in Germany. From the voting advice of the "Wahl-O-Mat," we quantify the degree of alignment of LLMs with German political parties. We show that larger models, such as Llama3-70B, tend to align more closely with left-leaning political parties, while smaller models often remain neutral, particularly when prompted in English. The central finding is that LLMs are similarly biased, with low variance in their alignment with any specific party. Our findings underline the importance of rigorously assessing bias in LLMs and making it transparent, to safeguard the integrity and trustworthiness of applications that employ the capabilities of performative prediction and the invisible hand of machine learning prediction and language generation.
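
As a rough illustration of how an alignment score of this kind can be computed, the sketch below compares a model's answers with a party's positions statement by statement. The scoring rule is an assumption, not taken from the paper: 2 points for an identical position, 1 point when one side is neutral, and 0 points for opposite positions, normalized to a percentage. The function name `alignment` and the integer encoding of answers are hypothetical.

```python
from typing import Sequence

# Hypothetical encoding of an answer to one statement.
AGREE, NEUTRAL, DISAGREE = 1, 0, -1


def alignment(model_answers: Sequence[int], party_positions: Sequence[int]) -> float:
    """Return the percentage agreement between a model and a party.

    Assumed scoring rule (not confirmed by the paper):
    same position -> 2 points, one step apart -> 1 point, opposite -> 0 points.
    """
    if len(model_answers) != len(party_positions):
        raise ValueError("both answer lists must cover the same statements")
    if not model_answers:
        raise ValueError("at least one statement is required")

    # abs(m - p) is 0 for identical answers, 1 for agree/neutral or
    # neutral/disagree, and 2 for agree/disagree.
    points = sum(2 - abs(m - p) for m, p in zip(model_answers, party_positions))
    return 100.0 * points / (2 * len(model_answers))


# Illustrative answers only; these are not values from the paper.
model = [AGREE, NEUTRAL, DISAGREE]
party = [AGREE, AGREE, AGREE]
print(alignment(model, party))  # 50.0
```

Under this rule, a model that answers every statement neutrally scores exactly 50% against any party that takes a position on every statement, so its answers would not separate one party from another.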

Paper Structure

This paper contains 10 sections, 5 figures, and 1 table.

Figures (5)

  • Figure 1: One political statement shown in the Wahl-O-Mat web interface for the 2024 European Parliament elections that translates to "The EU should be allowed to levy its own taxes."
  • Figure 2: Answers of the evaluated models. Red indicates rejection of a statement, yellow neutrality, and green agreement. Each statement is detailed in the paper's table of all statements.
  • Figure 3: Alignments of the LLMs with the political parties currently represented in the European Parliament. The alignment is obtained by submitting the LLMs' answers to the Wahl-O-Mat. When prompted in English, the Llama2-7B model cannot be evaluated through the Wahl-O-Mat, as its consistently neutral responses do not allow an estimation of alignment with the parties.
  • Figure 4: Box-whisker plots showing how the LLMs align across all political parties. Outliers are marked with dots. The LLMs exhibit consistent political bias characteristics, as evidenced by the low variance of their alignment with each party: the mean standard deviation over all parties is $\pm 7.40\%$ when prompted in German and $\pm 5.18\%$ when prompted in English.
  • Figure 5: Allocation of seats in the European Parliament based on the mean alignment of the LLMs using proportional representation (Mill, 1862). Distribution of seats: Grüne/EFA (GRÜNE, ÖDP, Piraten, Volt, Die PARTEI$^1$): 280 seats; EVP (CDU, CSU, FAMILIE): 128 seats; Renew (FDP, FREIE WÄHLER): 95 seats; Non-Inscrits: 76 seats; GUE/NGL (Die LINKE): 61 seats; S&D (SPD): 51 seats; and ID (AfD): 29 seats.
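
The caption of Figure 5 states that seats are allocated by proportional representation from mean alignments, but it does not name the rounding rule. The sketch below shows one common way to turn non-negative weights into whole seats. It assumes the largest-remainder method with a Hare quota, which may differ from what the authors used. The function name `allocate_seats` and the example weights are hypothetical.

```python
from fractions import Fraction
from math import floor


def allocate_seats(weights: dict[str, float], total_seats: int) -> dict[str, int]:
    """Distribute total_seats proportionally to weights (largest remainder).

    Assumption: largest-remainder rounding; the paper does not specify
    which proportional method was applied.
    """
    # Exact rational arithmetic avoids floating-point rounding surprises.
    exact = {name: Fraction(w) for name, w in weights.items()}
    total = sum(exact.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Ideal (fractional) seat count of each group.
    quotas = {name: w * total_seats / total for name, w in exact.items()}

    # Every group first receives the integer part of its quota.
    seats = {name: floor(q) for name, q in quotas.items()}

    # Remaining seats go to the groups with the largest fractional parts.
    leftover = total_seats - sum(seats.values())
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - seats[n], reverse=True)
    for name in by_remainder[:leftover]:
        seats[name] += 1
    return seats


# Illustrative weights only; these are not alignment values from the paper.
print(allocate_seats({"A": 50, "B": 30, "C": 20}, 10))  # {'A': 5, 'B': 3, 'C': 2}
```

The seat counts listed in the caption sum to 720, so calling `allocate_seats` with those counts as weights and `total_seats=720` returns the same distribution unchanged.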