Unraveling the Web of Disinformation: Exploring the Larger Context of State-Sponsored Influence Campaigns on Twitter
Mohammad Hammas Saeed, Shiza Ali, Pujan Paudel, Jeremy Blackburn, Gianluca Stringhini
TL;DR
The paper addresses state-sponsored disinformation campaigns on Twitter by proposing a campaign-agnostic detection framework that generalizes to unseen campaigns. It identifies traits shared across 19 campaigns, translates them into a feature set spanning four modalities (user attributes, temporal patterns, stylometry, and source information), and trains a Random Forest classifier that achieves up to 98.5% accuracy on balanced data and up to 94% detection accuracy across campaigns. The authors validate their approach on a large-scale dataset (over 200 million tweets) and demonstrate real-world applicability by flagging 116 potentially malicious accounts in the wild and presenting case studies that align with known campaigns. The work argues that relying on multiple signals makes the detector more resilient to evasion, and it discusses implications for automated safety features on social platforms. It also acknowledges limitations, such as API access constraints and language biases, and outlines avenues for future research.
Abstract
Social media platforms offer unprecedented opportunities for connectivity and exchange of ideas; however, they also serve as fertile ground for the dissemination of disinformation. Over the years, there has been a rise in state-sponsored campaigns aiming to spread disinformation and sway public opinion on sensitive topics through designated accounts, known as troll accounts. Past works on detecting accounts belonging to state-backed operations focus on a single campaign. While campaign-specific detection techniques are easier to build, no prior work has developed campaign-agnostic systems that offer generalized detection of troll accounts, unaffected by the biases of the specific campaign they belong to. In this paper, we identify several strategies adopted across different state actors and present a system that leverages them to detect accounts from previously unseen campaigns. We study 19 state-sponsored disinformation campaigns that took place on Twitter, originating from various countries. The strategies include sending automated messages through popular scheduling services, retweeting and sharing selective content, and using fake versions of verified applications for pushing content. By translating these traits into a feature set, we build a machine learning-based classifier that can correctly identify up to 94% of accounts from unseen campaigns. Additionally, we run our system in the wild and find more accounts that could potentially belong to state-backed operations. We also present case studies to highlight the similarity between the accounts found by our system and those identified by Twitter.
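
To make the idea of translating campaign traits into a feature set more concrete, the following is a minimal sketch of how one account's tweets might be turned into a flat feature vector covering user attributes, timing, writing style, and posting source. It is an illustration only, not the authors' implementation: the function name `account_features`, the input field names, the specific features, and the `SCHEDULING_SOURCES` list are all assumptions introduced here.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev

# Illustrative list of scheduling/automation clients (an assumption;
# the list used in the paper is not given in this summary).
SCHEDULING_SOURCES = {"TweetDeck", "Hootsuite", "Buffer", "IFTTT"}


def account_features(account, tweets):
    """Build a flat feature dict for one account (hypothetical schema).

    account: dict with 'followers' and 'following' counts.
    tweets:  list of dicts with 'text', 'created_at' (datetime),
             'source' (client name) and 'is_retweet' (bool).
    """
    n = len(tweets)
    if n == 0:
        raise ValueError("need at least one tweet")

    # Seconds between consecutive tweets, in chronological order.
    times = sorted(t["created_at"] for t in tweets)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

    words = [w for t in tweets for w in t["text"].split()]
    sources = Counter(t["source"] for t in tweets)
    scheduled = sum(c for s, c in sources.items() if s in SCHEDULING_SOURCES)

    return {
        # User attributes
        "follower_following_ratio": account["followers"] / max(account["following"], 1),
        # Temporal patterns
        "mean_gap_seconds": mean(gaps) if gaps else 0.0,
        "std_gap_seconds": pstdev(gaps) if gaps else 0.0,
        # Writing style and sharing behaviour
        "mean_words_per_tweet": len(words) / n,
        "retweet_ratio": sum(t["is_retweet"] for t in tweets) / n,
        # Source information
        "scheduled_ratio": scheduled / n,
        "distinct_sources": len(sources),
    }


# Toy input, invented purely for illustration.
tweets = [
    {"text": "Breaking news today", "created_at": datetime(2020, 1, 1, 12, 0),
     "source": "TweetDeck", "is_retweet": False},
    {"text": "RT @x: read this now", "created_at": datetime(2020, 1, 1, 12, 10),
     "source": "TweetDeck", "is_retweet": True},
    {"text": "hello", "created_at": datetime(2020, 1, 1, 12, 20),
     "source": "Twitter Web App", "is_retweet": False},
]
features = account_features({"followers": 50, "following": 200}, tweets)
```

Feature dicts of this kind, one per account, could then be passed to any standard classifier. Evaluating on unseen campaigns would mean holding out all accounts from one campaign at a time, so that the test campaign contributes nothing to training.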
