Large Engagement Networks for Classifying Coordinated Campaigns and Organic Twitter Trends

Atul Anand Gopalakrishnan, Jakir Hossain, Tugrulcan Elmas, Ahmet Erdem Sariyuce

TL;DR

This work addresses the challenge of distinguishing coordinated campaigns from organic Twitter trends by introducing LEN, a large-scale benchmark of engagement networks derived from Turkish Twitter data (179 campaigns, 135 non-campaigns; ~11K nodes and ~23K edges per graph on average). Campaign ground truth is obtained via ephemeral astroturfing, where bots post lexicon-based tweets to push topics to trends and then delete them, providing strong signals for graph-level classification. The authors benchmark multiple graph neural networks (e.g., GCN, GAT, GIN, GraphSAGE, GINE) and non-neural baselines (VNGE, LSD) on three tasks: binary campaign vs non-campaign, multiclass campaign-type classification, and a news-focused binary split, highlighting the difficulty of large-scale graphs and the impact of label imbalance. LEN serves as a challenging, real-world benchmark that can drive development of scalable graph classification methods and improve understanding of coordinated campaigns versus organic trends on social media.

Abstract

Social media users and inauthentic accounts, such as bots, may coordinate in promoting their topics. Such topics may give the impression that they are organically popular among the public, even though they are astroturfing campaigns that are centrally managed. It is challenging to predict whether a topic is organic or a coordinated campaign due to the lack of reliable ground truth. In this paper, we create such ground truth by detecting the campaigns promoted by ephemeral astroturfing attacks. These attacks push any topic to Twitter's (X) trends list by employing bots that tweet in a coordinated manner in a short period and then immediately delete their tweets. We manually curate a dataset of organic Twitter trends. We then create engagement networks out of these datasets, which can serve as a challenging testbed for the graph classification task of distinguishing between campaigns and organic trends. Engagement networks consist of users as nodes and engagements (retweets, replies, and quotes) between users as edges. We release the engagement networks for 179 campaigns and 135 non-campaigns, and also provide finer-grained labels to characterize the types of campaigns and non-campaigns. Our dataset, LEN (Large Engagement Networks), is available at the URL below. In comparison to traditional graph classification datasets, which are small with tens of nodes and hundreds of edges at most, graphs in LEN are larger. The average graph in LEN has ~11K nodes and ~23K edges. We show that state-of-the-art GNN methods give only mediocre results for campaign vs. non-campaign and campaign type classification on LEN. LEN offers a unique and challenging playing field for the graph classification problem. We believe that LEN will help advance the frontiers of graph classification techniques on large networks and also provide an interesting use case in terms of distinguishing coordinated campaigns and organic trends.
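
The engagement-network definition in the abstract (users as nodes; retweets, replies, and quotes as edges) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' pipeline: the `(source_user, target_user, engagement_type)` record layout, the choice to make edges directed from the engaging user to the engaged-with user, the repeat counts kept as edge weights, and the name `build_engagement_network` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (source_user, target_user, engagement_type).
Engagement = Tuple[str, str, str]

# The three engagement types named in the abstract.
ENGAGEMENT_TYPES = {"retweet", "reply", "quote"}


def build_engagement_network(
    engagements: Iterable[Engagement],
) -> Tuple[Set[str], Dict[Engagement, int]]:
    """Return (nodes, edges) for one trend.

    nodes: the set of user ids that appear in any engagement.
    edges: maps (source, target, type) to the number of times that
           engagement occurred (direction and weighting are assumptions).
    """
    nodes: Set[str] = set()
    edges: Dict[Engagement, int] = defaultdict(int)
    for source, target, kind in engagements:
        if kind not in ENGAGEMENT_TYPES:
            raise ValueError(f"unknown engagement type: {kind!r}")
        nodes.update((source, target))
        edges[(source, target, kind)] += 1
    return nodes, dict(edges)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    sample = [
        ("alice", "bob", "retweet"),
        ("carol", "bob", "retweet"),
        ("bob", "alice", "reply"),
        ("carol", "alice", "quote"),
        ("alice", "bob", "retweet"),  # repeated engagement raises the weight
    ]
    nodes, edges = build_engagement_network(sample)
    print(len(nodes), len(edges))  # prints: 3 4
```

One such graph would be built per trend and paired with its campaign or non-campaign label; a graph classifier then takes the whole graph as a single input.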

Paper Structure

This paper contains 19 sections, 5 figures, 7 tables.

Figures (5)

  • Figure 1: (Left) Randomly generated (lexicon) tweets from bots promoting the hashtag #HeartBridgeCoin. (Right) The hashtag trends in 6 countries and globally, for the first and last time.
  • Figure 2: Training runtime (in seconds) vs graph size.
  • Figure 3: Confusion matrices to display the performance of the graph classifiers.
  • Figure 4: Receiver Operating Characteristic (ROC) curves for campaign vs. non-campaign classification across the small dataset.
  • Figure 5: Receiver Operating Characteristic (ROC) curves for campaign vs. non-campaign classification across the complete dataset.