"Whose Side Are You On?" Estimating Ideology of Political and News Content Using Large Language Models and Few-shot Demonstration Selection

Muhammad Haroon, Magdalena Wojcieszak, Anshuman Chhabra

TL;DR

This work investigates estimating political ideology of online content using large language models with few-shot in-context learning. It introduces a Set-BSR demonstration selection method to create balanced, high-coverage prompts, and evaluates across YouTube Slant, Ad Fontes, and AllSides datasets using GPT-4o, Llama-2-13B, and Mistral-7B. The results show that few-shot ICL with balanced demonstrations substantially improves accuracy over zero-shot and traditional supervised baselines, with metadata such as source information providing notable gains, while certain modalities (e.g., thumbnails) and chain-of-thought prompts offer limited or negative benefits. The findings demonstrate GPT-4o’s ability to approach human-level consistency on several benchmarks and highlight practical implications and ethical considerations for scalable, cross-domain ideology classification in news and video content.

Abstract

The rapid growth of social media platforms has led to concerns about radicalization, filter bubbles, and content bias. Existing approaches to classifying ideology are limited: they require extensive human effort and the labeling of large datasets, and they cannot adapt to evolving ideological contexts. This paper explores the potential of Large Language Models (LLMs) for classifying the political ideology of online content through in-context learning (ICL). Our extensive experiments with label-balanced demonstration selection, conducted on three datasets comprising news articles and YouTube videos, reveal that our approach significantly outperforms zero-shot and traditional supervised methods. Additionally, we evaluate the influence of metadata (e.g., content source and descriptions) on ideological classification and discuss its implications. Finally, we show how providing the source for political and non-political content influences the LLM's classification.
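To make the idea of label-balanced demonstration selection concrete, here is a minimal sketch. It is not the paper's Set-BSR method: the similarity score is a simple token-overlap stand-in, and the function names (`token_overlap`, `select_balanced_demos`, `build_prompt`), the label strings, and the prompt wording are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase token sets (a stand-in similarity score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_balanced_demos(query, pool, k):
    """Pick up to k (text, label) pairs from pool, cycling over the labels so
    that the selection is as balanced as the pool allows, and preferring the
    candidates most similar to the query within each label."""
    by_label = defaultdict(list)
    for text, label in pool:
        by_label[label].append((text, label))
    # Rank each label's candidates by similarity to the query, best first.
    for label in by_label:
        by_label[label].sort(key=lambda d: token_overlap(query, d[0]), reverse=True)

    labels = sorted(by_label)
    next_idx = {label: 0 for label in labels}
    chosen = []
    while len(chosen) < k:
        progressed = False
        for label in labels:
            if len(chosen) == k:
                break
            if next_idx[label] < len(by_label[label]):
                chosen.append(by_label[label][next_idx[label]])
                next_idx[label] += 1
                progressed = True
        if not progressed:  # pool exhausted before reaching k
            break
    return chosen


def build_prompt(query, demos):
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    parts = ["Classify the political ideology of each item."]
    for text, label in demos:
        parts.append(f"Text: {text}\nLabel: {label}")
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)
```

Calling `select_balanced_demos(query, pool, k)` with a pool that contains several items per class yields roughly `k / num_labels` demonstrations per class, which `build_prompt` then places ahead of the unlabeled query.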

Paper Structure

This paper contains 32 sections, 5 figures, 19 tables, 1 algorithm.

Figures (5)

  • Figure 1: Improvement in accuracies of the GPT-4o, Llama2, and Mistral models across the three datasets by increasing the number of ICL demonstrations $k$. The baseline MLP and RoBERTa models are also shown. We see that a higher number of demonstrations leads to better predictions, and that the models generally outperform the baselines (even though the RoBERTa and MLP models have access to the full training set, and the LLMs only see $k$ demonstrations in-context). The error bars show a $95\%$ confidence interval.
  • Figure 2: Improvement in the accuracies of the GPT-4o, Llama2, and Mistral models when testing different combinations of the title, source, and description from the three datasets in the zero-shot setting. We see that providing source and description substantially improves the accuracy of the models on the Ad Fontes and AllSides news datasets, where the news sources are more likely to be already known by the model. The YouTube dataset, on the other hand, sees a degradation in performance, apart from a slight improvement in the case of Mistral. The error bars show a 95% confidence interval.
  • Figure 3: Heatmap showing changes in the performance of GPT-4o when the channel name is provided for a) political news, b) non-political news, and c) political non-news YouTube videos. Positive values along the main diagonal represent an increase in accuracy. For example, +16% in the top-left cell means providing the channel name increased the accuracy of classifying the Liberal class from 54% to 70%. We see more accurate prediction (increase in top-left and bottom-right) of news content (panels a and b) when the channel name is provided, showing how the LLM is already familiar with the partisan leanings of online news channels.
  • Figure 4: Increases in accuracy across all datasets by increasing the number of ICL shots.
  • Figure 5: Placement of channels based on the slant cut-offs for the YouTube dataset.