On Context-aware Detection of Cherry-picking in News Reporting
Israa Jaradat, Haiqi Zhang, Chengkai Li
TL;DR
This work tackles the problem of detecting cherry-picking in news by identifying important statements that a story omits, using context from other narratives of the same event. It formalizes the task as $c_i = \{\exists s \in S_e: f(s,d)=1\} - d_i$ and develops a context-aware framework combining fine-tuned transformers, zero-/few-shot prompting of LLMs, and unsupervised baselines, evaluated on a novel Cherry dataset with 3,346 examples. The best results are an F1 of about 0.89 and an accuracy of about 0.90, with Longformer-large reaching 0.897 accuracy and 0.887 F1 at 500 words of context. Results indicate that contextual information from multiple narratives improves detection and that outlet bias moderately influences measured cherry-picking. The work contributes a publicly released dataset and demonstrates a scalable approach for auditing omission bias in news reporting, with implications for media credibility assessment and automated fact-checking pipelines.
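The set-difference formulation above can be illustrated with a minimal sketch. It reads $S_e$ as the pool of statements about an event gathered across narratives, $f$ as an importance classifier, and $d_i$ as the statements of the target story; this reading is an assumption drawn from the summary, not a confirmed definition. The names `cherry_picked` and `toy_importance` are hypothetical, the importance function is a toy stand-in rather than the paper's fine-tuned model, and exact string matching stands in for whatever coverage test the framework actually uses.

```python
from typing import Callable, Iterable, List


def cherry_picked(
    event_statements: Iterable[str],
    document_statements: Iterable[str],
    is_important: Callable[[str, List[str]], bool],
) -> List[str]:
    """Return the important event statements that the target document omits."""
    doc = list(document_statements)
    covered = set(doc)  # assumption: coverage is exact string equality
    return [
        s for s in event_statements
        if is_important(s, doc) and s not in covered
    ]


def toy_importance(statement: str, document: List[str]) -> bool:
    """Toy stand-in for f: a statement counts as important if it has a digit."""
    return any(ch.isdigit() for ch in statement)


# Hypothetical event pool and target story, for illustration only.
event = [
    "The bill passed 52 to 48.",
    "Senators debated for hours.",
    "The bill adds 3 billion dollars in spending.",
]
target_doc = ["The bill passed 52 to 48.", "Senators debated for hours."]

missing = cherry_picked(event, target_doc, toy_importance)
print(missing)
```

In practice the importance function would be a trained model that also receives context from the other narratives, and coverage would likely need semantic matching rather than string equality.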
Abstract
Cherry-picking refers to the deliberate selection of evidence or facts that favor a particular viewpoint while ignoring or distorting evidence that supports an opposing perspective. Manually identifying cherry-picked statements in news stories can be challenging. In this study, we introduce a novel approach to detecting cherry-picked statements by identifying missing important statements in a target news story using language models and contextual information from other news sources. Furthermore, this research introduces a novel dataset specifically designed for training and evaluating cherry-picking detection models. Our best-performing model achieves an F1 score of about 89% in detecting important statements. Moreover, results show the effectiveness of incorporating external knowledge from alternative narratives when assessing statement importance.
