Comparative Insights from 12 Machine Learning Models in Extracting Economic Ideology from Political Text

Jihed Ncib

TL;DR

Generative models such as GPT-4o and Gemini 1.5 Flash consistently outperform other models against all benchmarks, but they pose issues of accessibility and resource availability. Fine-tuning is a competitive alternative, but its dependency on training data severely limits scalability.

Abstract

This study systematically assesses the capabilities of 12 machine learning models and model variations in detecting economic ideology. As an evaluation benchmark, I use manifesto data spanning six elections in the United Kingdom, pre-annotated by expert and crowd coders. The analysis assesses the performance of several generative, fine-tuned, and zero-shot models at the granular and aggregate levels. The results show that generative models such as GPT-4o and Gemini 1.5 Flash consistently outperform other models against all benchmarks. However, they pose issues of accessibility and resource availability. Fine-tuning yielded competitive performance and offers a reliable alternative through domain-specific optimization, but its dependency on training data severely limits scalability. Zero-shot models consistently struggle to identify signals of economic ideology, often producing negative associations with human coding. Using general knowledge for the domain-specific task of ideology scaling proved unreliable. Other key findings include considerable within-party variation, gains for fine-tuning from larger training data, and the sensitivity of zero-shot models to prompt content. The assessments cover the strengths and limitations of each model and derive best practices for automated analyses of political content.
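
The abstract distinguishes granular from aggregate evaluation and reports associations between model output and human coding. As a rough illustration of what an aggregate comparison can look like, the sketch below averages sentence-level scores within each manifesto and correlates the model means with the expert means. It is a minimal sketch under assumptions, not the paper's pipeline: the manifesto labels, the scores, and the helper names `manifesto_means` and `pearson` are all invented for illustration, and Pearson correlation is an assumed choice of association measure.

```python
from collections import defaultdict
from math import sqrt


def manifesto_means(rows):
    """Average sentence-level scores within each manifesto."""
    totals = defaultdict(lambda: [0.0, 0])  # manifesto -> [sum, count]
    for manifesto, score in rows:
        totals[manifesto][0] += score
        totals[manifesto][1] += 1
    return {m: total / count for m, (total, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (manifesto, sentence score).
# Negative = more left, positive = more right; all values are invented.
model_rows = [
    ("A_2010", -1), ("A_2010", -2),
    ("B_2010", 1), ("B_2010", 2),
    ("C_2010", 0), ("C_2010", 1),
]
expert_rows = [
    ("A_2010", -2), ("A_2010", -1),
    ("B_2010", 2), ("B_2010", 1),
    ("C_2010", 0), ("C_2010", 0),
]

model_means = manifesto_means(model_rows)
expert_means = manifesto_means(expert_rows)

# Align the two sets of means on a shared, sorted list of manifestos.
manifestos = sorted(model_means)
r = pearson(
    [model_means[m] for m in manifestos],
    [expert_means[m] for m in manifestos],
)
print(f"manifesto-level correlation: {r:.3f}")
```

Averaging before correlating smooths out sentence-level noise, so an aggregate coefficient can look strong even when sentence-by-sentence agreement is weaker, which is one reason to report both levels.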

Paper Structure (14 sections, 1 equation, 20 figures, 15 tables)

Figures (20)

  • Figure 1: Number of sentences by manifesto and election year.
  • Figure 2: Average economic ideological score by manifesto based on crowd and expert codings. Negative values indicate a more left position. Positive values indicate a more right position.
  • Figure 3: Aggregate manifesto-level correlation coefficients by model - expert coding.
  • Figure 4: Performance metrics of different generative models.
  • Figure 5: Aggregate manifesto-level correlation coefficients of generative models - expert coding.
  • ...and 15 more figures