Investigating LLMs as Voting Assistants via Contextual Augmentation: A Case Study on the European Parliament Elections 2024

Ilias Chalkidis

TL;DR

This work audits the MISTRAL and MIXTRAL models, evaluating how accurately they predict the stance of political parties on the latest “EU and I” voting assistance questionnaire. MIXTRAL reaches 82% accuracy on average.

Abstract

In light of the recent 2024 European Parliament elections, we investigate whether LLMs can be used as Voting Advice Applications (VAAs). We audit the MISTRAL and MIXTRAL models and evaluate their accuracy in predicting the stance of political parties based on the latest "EU and I" voting assistance questionnaire. Furthermore, we explore two ways to improve the models' performance by augmenting the input context: Retrieval-Augmented Generation (RAG) relying on web search, and Self-Reflection using staged conversations that aim to re-collect relevant content from the model's internal memory. We find that MIXTRAL is highly accurate, with 82% accuracy on average, but with a significant performance disparity across different political groups (50-95%). Augmenting the input context with expert-curated information can lead to a significant boost of approx. 9%; achieving a comparable gain remains an open challenge for automated RAG approaches, even when they draw on curated content.

Paper Structure

This paper contains 30 sections, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: Depiction of the experimental framework. In Setting (0), there is no context augmentation. In Setting (A), the context is augmented using web search to retrieve relevant content. In Setting (B), the context is self-augmented by asking the model preliminary questions to generate a summary for the party and its expected opinion related to the question. In Setting (C), the input context is augmented with the party's position related to the question.
  • Figure 2: Accuracy of the examined models (Mistral in blue, and Mixtral in orange) on the EUandI-2024 dataset across all settings (Section \ref{sec:settings}) and examined groups (4 EU Member States + euro-parties).
  • Figure 3: Accuracy of Mixtral on different sub-settings of Setting B: Self-Augmented Context.
  • Figure 4: Accuracy of Mixtral using RAG based on different corpora (document collections).
  • Figure 5: Accuracy of Mixtral across euro-groups, based on the coalitions formed in the 9th European Parliament (2019-2024).
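
The four settings in the Figure 1 caption differ only in where the extra context comes from, which can be illustrated with a small prompt-assembly sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the function `build_prompt`, the prompt wording, and the three stub context sources are all hypothetical, and a real pipeline would replace the stubs with a web-search retriever (A), staged model calls (B), and expert-curated party positions (C).

```python
from typing import Callable, Dict

# A context source maps (party, statement) to a block of context text.
ContextSource = Callable[[str, str], str]


def build_prompt(party: str, statement: str, setting: str,
                 sources: Dict[str, ContextSource]) -> str:
    """Assemble a stance-prediction prompt for settings "0", "A", "B", or "C".

    The prompt wording is an assumption made for illustration only.
    """
    question = (
        f"Party: {party}\n"
        f"Statement: {statement}\n"
        "Does the party agree or disagree with the statement?"
    )
    if setting == "0":
        # Setting (0): no context augmentation.
        return question
    if setting not in sources:
        raise ValueError(f"no context source for setting {setting!r}")
    # Settings (A), (B), (C): prepend the context from the chosen source.
    context = sources[setting](party, statement)
    return f"Context:\n{context}\n\n{question}"


# Hypothetical stand-ins for the three augmentation strategies.
def web_search_stub(party: str, statement: str) -> str:
    return f"[retrieved web content about {party}]"


def self_reflection_stub(party: str, statement: str) -> str:
    return f"[model-generated summary of {party} and its expected opinion]"


def curated_position_stub(party: str, statement: str) -> str:
    return f"[expert-curated position of {party} on this statement]"


SOURCES: Dict[str, ContextSource] = {
    "A": web_search_stub,
    "B": self_reflection_stub,
    "C": curated_position_stub,
}
```

Calling `build_prompt("Party X", "Statement Y", "C", SOURCES)` returns the same question as Setting (0), preceded by a `Context:` block produced by the curated-position stub.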