Decoding News Bias: Multi Bias Detection in News Articles
Bhushan Santosh Shah, Deven Santosh Shah, Vahida Attar
TL;DR
This work tackles the detection of multiple biases in news articles by constructing a domain-diverse, multiclass/multilabel bias dataset via LLM-guided annotation and evaluating several transformer-based classifiers. It assembles 9,790 full-text articles across six domains, annotates 4,886 samples into seven bias categories using GPT-4o mini prompts, and benchmarks BERT, RoBERTa, ALBERT, DistilBERT, and XLNet with multilabel stratified splits. The findings indicate that BERT generally delivers the strongest performance (notably F1 ≈ 0.89 on Political Bias), but substantial challenges remain due to class imbalance and potential LLM misannotations. The study provides a foundation for broad-spectrum bias detection in news and points to improving labeling reliability and data diversity as directions for future work.
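To make the multilabel setup concrete, here is a minimal sketch of how an article's bias tags could be encoded as a multi-hot vector and how a per-category F1 score (the metric quoted above for Political Bias) could be computed. It is an illustration under stated assumptions, not the authors' implementation: the label list is hypothetical (only political and gender bias are named in the text, so the other five entries are placeholders), and the helper names `to_multi_hot` and `per_label_f1` are invented for this example.

```python
from typing import Dict, List, Sequence

# Hypothetical label names: only "political" and "gender" appear in the text;
# the remaining entries are placeholders for the other five categories.
LABELS = ["political", "gender", "bias_3", "bias_4", "bias_5", "bias_6", "bias_7"]


def to_multi_hot(tags: Sequence[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Encode a set of bias tags as a 0/1 vector, one position per category."""
    tag_set = set(tags)
    unknown = tag_set - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in tag_set else 0 for name in labels]


def per_label_f1(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
    labels: Sequence[str] = LABELS,
) -> Dict[str, float]:
    """Compute F1 independently for each category (binary F1 per column)."""
    scores: Dict[str, float] = {}
    for j, name in enumerate(labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data: three articles, each with zero or more gold and predicted tags.
gold = [to_multi_hot(["political"]), to_multi_hot(["political", "gender"]), to_multi_hot([])]
pred = [to_multi_hot(["political"]), to_multi_hot(["gender"]), to_multi_hot(["political"])]
f1_scores = per_label_f1(gold, pred)
```

Reporting F1 per category rather than a single aggregate keeps rare categories visible, which matters when classes are imbalanced as noted above.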
Abstract
News articles provide crucial information about events happening in society, but they unfortunately come with different kinds of biases. These biases can significantly distort public opinion and trust in the media, making it essential to develop techniques to detect and address them. Previous work has mostly focused on identifying biases in particular domains, e.g., political or gender bias. However, more comprehensive studies are needed to detect biases across diverse domains. Large language models (LLMs) offer a powerful way to analyze and understand natural language, making them well suited to constructing datasets and detecting these biases. In this work, we explore various biases present in news articles, build a dataset using LLMs, and present results obtained using multiple detection techniques. Our approach highlights the importance of broad-spectrum bias detection and offers new insights for improving the integrity of news articles.
