"Whose Side Are You On?" Estimating Ideology of Political and News Content Using Large Language Models and Few-shot Demonstration Selection
Muhammad Haroon, Magdalena Wojcieszak, Anshuman Chhabra
TL;DR
This work investigates estimating political ideology of online content using large language models with few-shot in-context learning. It introduces a Set-BSR demonstration selection method to create balanced, high-coverage prompts, and evaluates across YouTube Slant, Ad Fontes, and AllSides datasets using GPT-4o, Llama-2-13B, and Mistral-7B. The results show that few-shot ICL with balanced demonstrations substantially improves accuracy over zero-shot and traditional supervised baselines, with metadata such as source information providing notable gains, while certain modalities (e.g., thumbnails) and chain-of-thought prompts offer limited or negative benefits. The findings demonstrate GPT-4o’s ability to approach human-level consistency on several benchmarks and highlight practical implications and ethical considerations for scalable, cross-domain ideology classification in news and video content.
Abstract
The rapid growth of social media platforms has led to concerns about radicalization, filter bubbles, and content bias. Existing approaches to classifying ideology are limited: they require extensive human effort, depend on labeling large datasets, and cannot adapt to evolving ideological contexts. This paper explores the potential of Large Language Models (LLMs) for classifying the political ideology of online content through in-context learning (ICL). Our extensive experiments with label-balanced demonstration selection, conducted on three datasets comprising news articles and YouTube videos, reveal that our approach significantly outperforms zero-shot and traditional supervised methods. Additionally, we evaluate the influence of metadata (e.g., content source and descriptions) on ideological classification and discuss its implications. Finally, we show how providing the source for political and non-political content influences the LLM's classification.
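To make the idea of label-balanced demonstration selection concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a three-way label set (left, center, right) and uses simple token overlap as a stand-in for the coverage score. The function names `select_balanced_demos` and `build_prompt`, the prompt wording, and the example texts are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return set(text.lower().split())


def select_balanced_demos(query, pool, k_per_label):
    """Pick up to k_per_label demonstrations for each label.

    pool is a list of (text, label) pairs. Within each label, candidates
    are chosen greedily by how many not-yet-covered query tokens they add,
    which is a token-overlap stand-in for a set-level coverage score.
    """
    query_tokens = tokenize(query)
    by_label = defaultdict(list)
    for text, label in pool:
        by_label[label].append(text)

    selected = []
    for label in sorted(by_label):
        covered = set()
        candidates = list(by_label[label])
        for _ in range(min(k_per_label, len(candidates))):
            # Candidate that covers the most new query tokens (ties: first seen).
            best = max(
                candidates,
                key=lambda t: len((tokenize(t) & query_tokens) - covered),
            )
            covered |= tokenize(best) & query_tokens
            candidates.remove(best)
            selected.append((best, label))
    return selected


def build_prompt(query, demos):
    """Assemble a few-shot prompt; the wording is illustrative only."""
    lines = ["Classify the political ideology of each item as left, center, or right.", ""]
    for text, label in demos:
        lines.append(f"Text: {text}")
        lines.append(f"Ideology: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Ideology:")
    return "\n".join(lines)


if __name__ == "__main__":
    pool = [
        ("tax cuts for business growth", "right"),
        ("border security and tax policy", "right"),
        ("expand public healthcare access", "left"),
        ("climate action and healthcare", "left"),
        ("local weather report today", "center"),
        ("city council meeting schedule", "center"),
    ]
    query = "new healthcare and tax policy debate"
    print(build_prompt(query, select_balanced_demos(query, pool, k_per_label=1)))
```

Because every label contributes the same number of demonstrations, the prompt does not favor any class by example count alone. The resulting prompt string would then be sent to whichever LLM is being evaluated.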
