A Cautionary Tale About "Neutrally" Informative AI Tools Ahead of the 2025 Federal Elections in Germany

Ina Dormuth, Sven Franke, Marlies Hafer, Tim Katzke, Alexander Marx, Emmanuel Müller, Daniel Neider, Markus Pauly, Jérôme Rutinowski

TL;DR

This work assesses the reliability of AI-based VAAs and LLMs for political information ahead of Germany's 2025 election by benchmarking against Wahl-O-Mat statements. It employs a stochastic, multi-model evaluation (ChatGPT 4o, DeepSeek V3, and DeepSeek R1) with five repetitions per prompt, using Retrieval-Augmented Generation and prompt variations to measure alignment with major parties. Key findings show a pronounced left-leaning bias in LLM responses (e.g., Greens and SPD around 79–86%) and substantial deviations/hallucinations in VAAs (Wahl.Chat deviates in 25% of cases; WAHLWEISE in 54%). The results underscore critical risks of deploying LLM-based VAAs in electoral contexts, necessitating rigorous certification, scrutiny of prompt sensitivity, and mechanisms to ensure factual alignment with party positions.

Abstract

In this study, we examine the reliability of AI-based Voting Advice Applications (VAAs) and large language models (LLMs) in providing objective political information. Our analysis is based on a comparison with party responses to 38 statements of the Wahl-O-Mat, a well-established German online tool that helps inform voters by comparing their views with political party positions. For the LLMs, we identify significant biases. They exhibit a strong alignment (over 75% on average) with left-wing parties and a substantially lower alignment with center-right (below 50%) and right-wing parties (around 30%). Furthermore, for the VAAs, which are intended to inform voters objectively, we found substantial deviations from the parties' stated positions in the Wahl-O-Mat: while one VAA deviated in 25% of cases, another showed deviations in more than 50% of cases. For the latter, we even observed that simple prompt injections led to severe hallucinations, including false claims of non-existent ties between political parties and right-wing extremists.
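To make the notion of "alignment" concrete, the following is a minimal sketch of how an agreement percentage between repeated model answers and one party's Wahl-O-Mat positions could be computed. The scoring scheme (2 points for an identical position, 1 point when the positions are one step apart, 0 points for opposite positions) is an assumption modelled on Wahl-O-Mat-style scoring and is not necessarily the exact weighting used in the paper. The names `pair_score` and `agreement_percent` and the integer encoding of answers are hypothetical.

```python
from statistics import mean

# Assumed encoding of positions (hypothetical, for illustration only).
AGREE, NEUTRAL, DISAGREE = 1, 0, -1


def pair_score(model_answer: int, party_answer: int) -> int:
    """Score one statement: 2 if identical, 1 if one step apart, 0 if opposite."""
    return 2 - abs(model_answer - party_answer)


def agreement_percent(model_runs: list[list[int]], party: list[int]) -> float:
    """Mean agreement (in %) of repeated model runs with one party's positions."""
    if any(len(run) != len(party) for run in model_runs):
        raise ValueError("each run must answer every statement")
    max_points = 2 * len(party)
    per_run = [
        100.0 * sum(pair_score(m, p) for m, p in zip(run, party)) / max_points
        for run in model_runs
    ]
    # Averaging over runs accounts for the stochastic nature of LLM output.
    return mean(per_run)


# Toy data: four statements, two repetitions of the same prompt.
party_positions = [AGREE, DISAGREE, NEUTRAL, AGREE]
runs = [
    [AGREE, DISAGREE, NEUTRAL, AGREE],  # matches the party on every statement
    [AGREE, AGREE, NEUTRAL, DISAGREE],  # opposite on two statements
]
print(agreement_percent(runs, party_positions))
```

Averaging over repetitions, rather than scoring a single response, reflects the study's stochastic design of five repetitions per prompt; the toy data above are invented and do not reproduce any reported result.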

Paper Structure

This paper contains 13 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: The first political statement as presented on the Wahl-O-Mat website for the 2025 German federal election (https://www.wahl-o-mat.de/bundestagswahl2025/app/main_app.html).
  • Figure 2: Weighted agreement of the three LLMs with the parties (in %) with variations.
  • Figure 3: Consistency (in %) between party positions in the Wahl-O-Mat and those attributed by WAHLWEISE.
  • Figure 4: Frequency of party positions in the Wahl-O-Mat compared to WAHLWEISE.
  • Figure 5: Response from WAHLWEISE on the SPD's position regarding the first Wahl-O-Mat statement on further military support for Ukraine (Approving in Wahl-O-Mat and explicitly mentioned in their election program).
  • ...and 1 more figure