Investigating Bias in LLM-Based Bias Detection: Disparities between LLMs and Human Perception
Luyang Lin, Lingzhi Wang, Jinsong Guo, Kam-Fai Wong
TL;DR
This work tackles the problem of biases embedded in LLMs when used for media bias detection, distinguishing system biases from content bias and outlining four research questions. It introduces two evaluation perspectives (LLM-based political bias prediction and article continuation) and leverages the FlipBias and ABP datasets to quantify biases, including through a Bias Tendency Index (BTI). The study shows that LLMs exhibit an overall left-leaning tendency with topic-dependent variation, that prompt-based debiasing strategies can reduce bias with modest trade-offs in accuracy, and that selective fine-tuning can reduce bias but risks introducing new biases. The findings underscore the need for robust debiasing in LLM-powered bias-detection pipelines and highlight cross-model variation, informing the design of fairer AI systems for media analysis.
Abstract
The pervasive spread of misinformation and disinformation on social media underscores the critical importance of detecting media bias. While robust Large Language Models (LLMs) have emerged as foundational tools for bias prediction, concerns about inherent biases within these models persist. In this work, we investigate the presence and nature of bias within LLMs and its consequential impact on media bias detection. Departing from conventional approaches that focus solely on bias detection in media content, we delve into biases within the LLM systems themselves. Through meticulous examination, we probe whether LLMs exhibit biases, particularly in political bias prediction and text continuation tasks. Additionally, we explore bias across diverse topics, aiming to uncover nuanced variations in bias expression within the LLM framework. Importantly, we propose debiasing strategies, including prompt engineering and model fine-tuning. Extensive analysis of bias tendencies across different LLMs sheds light on the broader landscape of bias propagation in language models. This study advances our understanding of LLM bias, offering critical insights into its implications for bias detection tasks and paving the way for more robust and equitable AI systems.
