Quantifying Generative Media Bias with a Corpus of Real-world and Generated News Articles

Filip Trhlik; Pontus Stenetorp

TL;DR

This paper addresses the challenge of measuring political bias in LLM-generated journalism by constructing a paired dataset of 2,100 human-authored news articles and 56,700 AI-generated articles from nine LLMs, with the generated articles derived from journalist-written summaries in the NEWSROOM source. It applies both supervised classifiers (RoBERTa-large, POLITICS) and LLM-based classifiers (CARP prompts) to quantify bias shifts, revealing that instruction-tuned LLMs display pronounced left-wing bias and that biases persist when LLMs are used as classifiers. The authors introduce a rigorous framework for bias quantification, including definitions of political alignment via $PA_{article} = P_{right} - P_{left}$ and shifts $\Delta PA$, along with careful data selection and quality controls (Self-BLEU, regeneration). The findings underscore potential risks to journalism from biased generation and classification, emphasize the need for careful model selection and prompting, and provide a publicly available dataset to spur further research into mitigating LLM political bias in media contexts.
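The alignment score above can be illustrated with a minimal Python sketch. Only $PA_{article} = P_{right} - P_{left}$ comes from the summary; the function names are hypothetical, and the definition of the shift as the mean alignment of the generated articles minus the alignment of the human-written source is an assumption made for illustration, not necessarily the paper's exact formula.

```python
from statistics import mean


def political_alignment(p_left: float, p_right: float) -> float:
    """PA_article = P_right - P_left, given classifier probabilities.

    The result lies in [-1, 1]: negative values lean left, positive lean right.
    """
    return p_right - p_left


def alignment_shift(pa_human: float, pa_generated: list[float]) -> float:
    """Assumed shift: mean PA of generated articles minus PA of the human article.

    This definition is an illustrative assumption; a negative value
    would indicate a shift to the left relative to the source article.
    """
    return mean(pa_generated) - pa_human


# Hypothetical classifier outputs for one human article and two generations.
human_pa = political_alignment(p_left=0.25, p_right=0.5)
generated_pa = [
    political_alignment(p_left=0.5, p_right=0.5),
    political_alignment(p_left=0.75, p_right=0.25),
]
shift = alignment_shift(human_pa, generated_pa)
```

In this toy case the human article scores slightly right of centre while both generations score at or left of centre, so the computed shift is negative.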

Abstract

Large language models (LLMs) are increasingly being utilised across a range of tasks and domains, with a burgeoning interest in their application within the field of journalism. This trend raises concerns due to our limited understanding of LLM behaviour in this domain, especially with respect to political bias. Existing studies predominantly focus on LLMs undertaking political questionnaires, which offers only limited insights into their biases and operational nuances. To address this gap, our study establishes a new curated dataset that contains 2,100 human-written articles and utilises their descriptions to generate 56,700 synthetic articles using nine LLMs. This enables us to analyse shifts in properties between human-authored and machine-generated articles, with this study focusing on political bias, detecting it using both supervised models and LLMs. Our findings reveal significant disparities between base and instruction-tuned LLMs, with instruction-tuned models exhibiting consistent political bias. Furthermore, we are able to study how LLMs behave as classifiers, observing their display of political bias even in this role. Overall, for the first time within the journalistic domain, this study outlines a framework and provides a structured dataset for quantifiable experiments, serving as a foundation for further research into LLM political bias and its implications.

Paper Structure (56 sections, 3 equations, 10 figures, 6 tables)

Introduction
Related Work
Political Bias Assessment
Classification Through LLMs
Political Bias in LLMs
Task & Data
Task Specification
Categorisation of News Articles
Data Selection
Article Length
Summary Length
Summary Metrics
Article Generation
Prompts
Model Selection
...and 41 more sections

Figures (10)

Figure 1: Political shift per model and prompt type (as assessed by supervised models)
Figure 2: Political shift per model and prompt type (as assessed by LLMs)
Figure 3: LLM classification bias, with the y-axis denoting the specific model in Equation \ref{deltaX} and the x-axis the results of the $C_{Bias}(i)$ calculations for two inputs
Figure 4: Most frequent unique words per category
Figure 5: News source distribution in the final dataset
...and 5 more figures
