On mission Twitter Profiles: A Study of Selective Toxic Behavior

Hina Qayyum, Muhammad Ikram, Benjamin Zi Hao Zhao, Ian D. Wood, Nicolas Kourtellis, Mohamed Ali Kaafar

TL;DR

The paper defines on-mission profiles as instrumented Twitter accounts that direct toxicity at specific themes while posting on diverse topics to appear legitimate. It presents a large-scale, label-free methodology that combines longitudinal data, contextualized topic modeling with BERTweet, and toxicity signals to measure thematic diversity and identify potential on-mission behavior. A 96-profile on-mission subset within the most diverse group (Group-VIII) is characterized in depth, revealing shared topics, toxicity patterns, and social connectivity that differentiate these profiles from their non-on-mission peers. A detector using 38 features spanning content, auxiliary content, activity, and profile metadata reaches up to 100% F1 and 100% accuracy on a held-out test set and is also evaluated in the wild, suggesting practical utility for flagging evolving influence operations while noting ethical and deployment considerations.

Abstract

The argument for persistent social media influence campaigns, often funded by malicious entities, is gaining traction. These entities utilize instrumented profiles to disseminate divisive content and disinformation, shaping public perception. Despite ample evidence of these instrumented profiles, few identification methods exist to locate them in the wild. To evade detection and appear genuine, small clusters of instrumented profiles engage in unrelated discussions, diverting attention from their true goals. This strategic thematic diversity conceals their selective polarity towards certain topics and fosters public trust. This study aims to characterize profiles potentially used for influence operations, termed 'on-mission profiles,' relying solely on thematic content diversity within unlabeled data. What distinguishes this work is its focus on content volume and toxicity towards specific themes. Longitudinal data from 138K Twitter (now X) profiles and 293M tweets enables profiling based on theme diversity. Groups with high thematic diversity predominantly produce toxic content concerning specific themes, such as politics, health, and news, which leads us to classify them as 'on-mission' profiles. Using the identified 'on-mission' profiles, we design a classifier for unseen, unlabeled data. Employing a linear SVM model, we train and test it on an 80/20% split of the most diverse profiles. The classifier achieves 100% accuracy, facilitating the discovery of previously unknown 'on-mission' profiles in the wild.
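
The train-and-test step described in the abstract (a linear SVM evaluated on an 80/20 split) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the data below is synthetic, the three features are hypothetical stand-ins for the paper's feature set, and the optimizer (stochastic subgradient descent on the L2-regularized hinge loss) and its hyperparameters `lam`, `lr`, and `epochs` are assumptions chosen for illustration.

```python
import random

def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out `test_fraction` of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(len(idx) * (1 - test_fraction)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def fit_linear_svm(X, y, lam=0.01, lr=0.05, epochs=50, seed=0):
    """Fit w, b by stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The L2 penalty shrinks w on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:
                # The sample is inside the margin: move the boundary away from it.
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def accuracy(w, b, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)

# Synthetic placeholder data (NOT from the paper): two well-separated clusters.
# Label +1 stands in for 'on-mission', -1 for other profiles.
rng = random.Random(42)
X, y = [], []
for _ in range(100):
    X.append([rng.gauss(3.0, 0.5) for _ in range(3)])
    y.append(1)
    X.append([rng.gauss(-3.0, 0.5) for _ in range(3)])
    y.append(-1)

X_train, y_train, X_test, y_test = train_test_split(X, y)
w, b = fit_linear_svm(X_train, y_train)
test_accuracy = accuracy(w, b, X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

On cleanly separable toy data like this, a linear boundary is easy to find, so a high held-out score here says nothing about the real task. In practice one would more likely use an established library implementation of a linear SVM rather than a hand-written optimizer.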

Paper Structure (41 sections, 3 equations, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Entropy (H) of the Category Probability Vector (CPV) per Twitter profile, organized into groups based on the number of categories identified in the profiles' tweets (Section "Thematic diversity"). Each interval represents a distinct group of profiles.
  • Figure 2: Boxplots of (a) Perspective toxicity scores and (b) the consistency of these toxicity scores for profiles in groups II-VIII (Section "Topic toxicity").
  • Figure 3: Cumulative distribution functions of the number of (a) total and (b) unique tweets per profile in groups II-VIII (Section "Prolificacy").
  • Figure 4: Cumulative distributions of (a) the total number of hashtags, (b) the number of unique hashtags, and (c) the ratio of hashtags to the total number of tweets per profile in groups II-VIII (Section "Hashtags").
  • Figure 5: (a) Probability density function and (b) cumulative distribution function of the normalized burstiness score per profile in groups II-VIII (Section "Burstiness").
  • ...and 4 more figures
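
The quantity plotted in Figure 1, the entropy of a profile's Category Probability Vector, can be sketched as follows. This is a minimal illustration under stated assumptions: this excerpt does not describe how the paper builds the CPV, so the sketch assumes each tweet carries a single category label and takes the CPV to be the share of tweets per category. The function names, the example labels, and the base-2 logarithm are hypothetical choices, not taken from the paper.

```python
import math
from collections import Counter

def category_probability_vector(tweet_categories):
    """Map each category to its share of a profile's tweets (assumed CPV)."""
    counts = Counter(tweet_categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def shannon_entropy(probabilities, base=2):
    """Shannon entropy H = sum(p * log(1/p)); zero-probability terms are skipped."""
    return sum(p * math.log(1 / p, base) for p in probabilities if p > 0)

# Hypothetical profiles: one posts evenly across four themes, one on a single theme.
diverse_profile = ["politics", "health", "news", "sports"] * 5
focused_profile = ["politics"] * 20

h_diverse = shannon_entropy(category_probability_vector(diverse_profile).values())
h_focused = shannon_entropy(category_probability_vector(focused_profile).values())
print(h_diverse, h_focused)
```

Under this definition, entropy grows as a profile spreads its tweets more evenly over more categories, which is why it serves as a measure of thematic diversity.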