Tracking the Takes and Trajectories of English-Language News Narratives across Trustworthy and Worrisome Websites
Hans W. A. Hanley, Emily Okabe, Zakir Durumeric
TL;DR
The paper presents a scalable system for mapping how English-language news narratives travel across trustworthy and worrisome websites: it embeds passages with encoder-based LLMs, groups them into story clusters with DP-Means, and infers inter-site relationships with NETINF, augmented by zero-shot stance detection. It shows that reliable outlets significantly influence topics and narratives across the ecosystem, while unreliable sites contribute distinctive stances and can seed propaganda networks on topics such as Ukraine and vaccines. The approach yields a near-global view of the English-language news landscape, helping journalists and fact-checkers prioritize narratives for verification and identify influential sources and coordination networks. The authors also release weights, code, and crawled URLs as open source to support reproducibility and further research in misinformation and propaganda analytics.
Abstract
Understanding how misleading and outright false information enters news ecosystems remains a difficult challenge that requires tracking how narratives spread across thousands of fringe and mainstream news websites. To do this, we introduce a system that utilizes encoder-based large language models and zero-shot stance detection to scalably identify and track news narratives and their attitudes across over 4,000 factually unreliable, mixed-reliability, and factually reliable English-language news websites. Running our system over an 18-month period, we track the spread of 146K news stories. Using network-based inference via the NETINF algorithm, we show that the paths of news narratives and the stances of websites toward particular entities can be used to uncover slanted propaganda networks (e.g., anti-vaccine and anti-Ukraine) and to identify the most influential websites in spreading these attitudes in the broader news ecosystem. We hope that increased visibility into our distributed news ecosystem can help with the reporting and fact-checking of propaganda and disinformation.
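
To make the clustering step concrete, the following is a minimal sketch of DP-Means on toy vectors. It is not the authors' implementation: the function name `dp_means`, the threshold parameter `lam`, the use of squared Euclidean distance, and the two-dimensional example points are all assumptions chosen for illustration, and a real pipeline would operate on high-dimensional passage embeddings. The idea it illustrates is that DP-Means behaves like k-means but opens a new cluster whenever a point lies farther than `lam` from every existing centroid, so the number of story clusters does not have to be fixed in advance.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed metric for this sketch)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(
    points: List[Vector], lam: float, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Toy DP-Means: `lam` is the squared-distance threshold for a new cluster."""
    # Start with a single cluster centred on the global mean.
    centroids: List[List[float]] = [mean(points)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > lam:
                # Too far from every centroid: open a new cluster at this point.
                centroids.append(list(p))
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centroids, dropping clusters that ended up empty.
        members: Dict[int, List[Vector]] = {}
        for i, k in enumerate(labels):
            members.setdefault(k, []).append(points[i])
        kept = sorted(members)
        remap = {old: new for new, old in enumerate(kept)}
        centroids = [mean(members[k]) for k in kept]
        labels = [remap[k] for k in labels]
        if not changed:
            break
    return labels, centroids


# Hypothetical stand-ins for passage embeddings: two tight groups.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
toy_labels, toy_centroids = dp_means(toy_points, lam=1.0)
```

In this sketch a smaller `lam` yields more, finer-grained clusters, while a very large `lam` collapses everything into one cluster; how the threshold is chosen for real embeddings is not specified here.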
