Positioning Political Texts with Large Language Models by Asking and Averaging

Gaël Le Mens; Aina Gallego

TL;DR

This work uses instruction-tuned Large Language Models (LLMs) such as GPT-4, Llama 3, Mixtral, or Aya to position political texts within policy and ideological spaces, and finds that this approach is generally more accurate than supervised classifiers trained on large amounts of research data.

Abstract

We use instruction-tuned Large Language Models (LLMs) such as GPT-4, Llama 3, Mixtral, or Aya to position political texts within policy and ideological spaces. We ask an LLM where a tweet or a sentence of a political text stands on the focal dimension and take the average of the LLM responses to position political actors such as US Senators, or longer texts such as UK party manifestos or EU policy speeches given in 10 different languages. The correlations between the position estimates obtained with the best LLMs and benchmarks based on text coding by experts, crowdworkers, or roll call votes exceed .90. This approach is generally more accurate than supervised classifiers trained on large amounts of research data. Using instruction-tuned LLMs to position texts in policy and ideological spaces is fast, cost-efficient, reliable, and reproducible (in the case of open LLMs), even when the texts are short and written in different languages. We conclude with cautionary notes about the need for empirical validation.
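The averaging and validation steps described in the abstract can be illustrated with a minimal sketch. It assumes the per-text LLM responses have already been collected and parsed into numbers; the 0 to 100 scale, the actor labels, the benchmark values, and the function names `average_positions` and `pearson` are hypothetical choices made for illustration, not details taken from the paper.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def average_positions(scored_texts):
    """Average per-text scores by actor.

    scored_texts: iterable of (actor, score) pairs, where score is the
    number parsed from one LLM response, or None if no score was obtained.
    """
    by_actor = defaultdict(list)
    for actor, score in scored_texts:
        if score is not None:  # skip texts without a usable score
            by_actor[actor].append(score)
    return {actor: mean(scores) for actor, scores in by_actor.items()}


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up per-tweet scores on an assumed 0 (left) to 100 (right) scale.
scored = [("A", 10), ("A", 20), ("B", 80), ("B", 90), ("C", 50), ("C", None)]
positions = average_positions(scored)

# Made-up benchmark positions (e.g. from expert coding) for the same actors.
benchmark = {"A": -0.4, "B": 0.6, "C": 0.2}

actors = sorted(positions)
r = pearson([positions[a] for a in actors], [benchmark[a] for a in actors])
```

The LLM query itself is deliberately left out, since the prompt wording and response parsing depend on the model and the focal dimension. The correlation `r` plays the role of the validation statistic reported against expert, crowdworker, or roll call benchmarks.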

Paper Structure (67 sections, 25 figures, 4 tables)

Introduction
Methods and data
Obtaining position estimates with LLMs
Data
Tweets published by US Congress members after the training cut-off date of GPT-4
Senators of the 117th Congress
British party manifestos
Multilingual setting: EU policy speeches in 10 languages
Results
Tweets published by members of the US Congress after the training cut-off date of GPT-4
Senators of the 117th US Congress
British party manifestos
Multilingual setting: EU policy speeches in 10 languages
Discussion
Data
...and 52 more sections

Figures (25)

Figure 1: Positioning tweets published by members of the US Congress on the left-right ideological spectrum ($N=899$).
Figure 2: Positioning Senators of the 117th Congress on the left-right ideological spectrum based on a random sample of 100 of their tweets ($N=98$). Each dot represents a senator ('+': Democrats, 'x': Republicans, 'o': others).
Figure 3: Positioning British party manifestos on the Economic policy dimension (left to right wing scale). The numbers next to the dots indicate the years of the manifestos.
Figure 4: Positioning British party manifestos on the Social policy dimension (liberal to conservative scale). The numbers next to the dots indicate the years of the manifestos.
Figure 5: Positioning EU legislative speeches in 10 languages on the 'anti-subsidy' to 'pro-subsidy' dimension.
...and 20 more figures
