LLMs left, right, and center: Assessing GPT's capabilities to label political bias from web domains

Raphael Hernandes; Giulio Corsi

TL;DR

This paper suggests that while GPT-4 can be a scalable, cost-effective tool for classifying the political bias of news websites, it should be used as a complement to human judgment to mitigate biases.

Abstract

This research investigates whether OpenAI's GPT-4, a state-of-the-art large language model, can accurately classify the political bias of news sources based solely on their URLs. Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale ("far-left" to "far-right"). The analysis compares GPT-4's classifications against MBFC's, and controls for website popularity using Open PageRank scores. Findings reveal a high correlation (Spearman's $\rho = .89$, $n = 5{,}877$, $p < 0.001$) between GPT-4's and MBFC's ratings, indicating the model's potential reliability. However, GPT-4 abstained from classifying approximately two-thirds of the dataset. It is more likely to abstain from rating unpopular websites, which also suffer from less accurate assessments. The LLM tends to avoid classifying sources that MBFC considers to be centrist, resulting in more polarized outputs. Finally, this analysis shows a slight leftward skew in GPT's classifications compared to MBFC's. Therefore, this paper suggests that while GPT-4 can be a scalable, cost-effective tool for political bias classification of news websites, it should be used as a complement to human judgment to mitigate biases.
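The comparison described in the abstract, a rank correlation between two sets of ordinal labels after setting aside the sources the model declined to rate, can be illustrated with a minimal sketch. This is not the authors' code. Only the endpoints of the scale ("far-left", "far-right") come from the abstract; the intermediate label names, the integer coding from -3 to +3, the use of `None` for an abstention, and the helper names `average_ranks`, `spearman_rho`, and `compare` are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical seven-point scale: only the endpoints appear in the abstract,
# the intermediate label names are assumptions.
SCALE = ["far-left", "left", "center-left", "center",
         "center-right", "right", "far-right"]
SCORE = {label: i - 3 for i, label in enumerate(SCALE)}  # -3 .. +3


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def compare(model_labels, reference_labels):
    """Drop abstentions (None) and correlate the remaining label pairs."""
    pairs = [(SCORE[m], SCORE[r])
             for m, r in zip(model_labels, reference_labels) if m is not None]
    xs, ys = zip(*pairs)
    return spearman_rho(xs, ys), len(pairs)


if __name__ == "__main__":
    model = ["left", None, "right", "center", "far-left"]
    reference = ["left", "center", "right", "center-left", "left"]
    rho, n = compare(model, reference)
    print(f"rho = {rho:.2f} on n = {n} rated sources")
```

Because abstentions are dropped before the correlation is computed, a high coefficient in a sketch like this describes only the rated subset, which is why the abstention rate matters when interpreting the result. The function divides by zero if either list is constant, so a real analysis would guard against that case.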

Paper Structure (24 sections, 6 figures, 9 tables)

Background
GPT-4 and LLMs
LLMs for Annotation Tasks
LLMs in Political Analysis
Rating News Sources' Political Bias
Applications of these Ratings
Methodology
Open PageRank
Prompting GPT-4
Results
Correlation Between GPT-4 and Human-Assigned Political Bias Ratings
Left-leaning Bias
Impact of Popularity on Accuracy
Unassigned Observations
Discussion
...and 9 more sections

Figures (6)

Figure 1: Distribution of news sources in absolute (left) and relative values (right).
Figure 2: Histograms of difference (top) and absolute difference (bottom) between GPT and MBFC ratings show concentration around minimal difference; charts on the right exclude zero for easier visualization.
Figure 3: The ROC curves of GPT's ratings, using MBFC as a baseline, binarized into biased and unbiased.
Figure 4: Heatmap of news sources classifications (left) shows that most sources fall within the expected axis (the colorful diagonal); heatmap of sources' popularity (right) indicates that popular sources converge towards the center.
Figure 5: Histogram of Open PageRank score.
...and 1 more figure
