Structure and Context of Retweet Coordination in the 2022 U.S. Midterm Elections

David Axelrod, John Paolillo

TL;DR

This study tackles the challenge of distinguishing coordinated activity from organic, motivated behavior in social media during the 2022 U.S. midterm elections. It introduces a latent sharing-space framework combined with a $k$-nearest-neighbor association metric and $\phi$-based edges to identify coordination candidates without relying on arbitrary similarity thresholds. By applying Singular Value Decomposition to a binarized retweeter–tweet matrix and clustering in the latent space, the authors uncover four user clusters with distinct themes, including music-awards promotion (Cluster A) and political mobilization (Clusters B–D). The findings highlight two primary coordination motifs and demonstrate how latent structure can reveal shared motivations across apparently fragmented groups, while also emphasizing careful interpretation to avoid misclassifying benign, self-organized activity as coordination.
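As a rough illustration of the $\phi$-based edges mentioned above, the sketch below computes the standard phi coefficient between two rows of a binarized retweeter–tweet matrix. The user names and the toy matrix are hypothetical, and nothing here is taken from the authors' code; it only shows how the association measure behaves on 0/1 retweet vectors.

```python
from math import sqrt

def phi(u, v):
    """Phi coefficient between two equal-length 0/1 vectors.

    Each vector is one user's row of the binarized matrix:
    1 if the user retweeted that tweet, 0 otherwise.
    """
    if len(u) != len(v):
        raise ValueError("vectors must cover the same set of tweets")
    n11 = sum(1 for a, b in zip(u, v) if a and b)          # both retweeted
    n10 = sum(1 for a, b in zip(u, v) if a and not b)      # only first user
    n01 = sum(1 for a, b in zip(u, v) if not a and b)      # only second user
    n00 = sum(1 for a, b in zip(u, v) if not a and not b)  # neither
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a constant row carries no association information
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical toy data: rows are users, columns are tweets.
retweets = {
    "u1": [1, 1, 1, 0, 0, 0],
    "u2": [1, 1, 1, 0, 0, 0],  # identical to u1
    "u3": [1, 1, 0, 1, 0, 0],  # partial overlap with u1
    "u4": [0, 0, 0, 1, 1, 1],  # complement of u1
}

print(phi(retweets["u1"], retweets["u2"]))  # identical rows
print(phi(retweets["u1"], retweets["u3"]))  # partial overlap
print(phi(retweets["u1"], retweets["u4"]))  # opposite behavior
```

Identical retweeting histories give $\phi = 1$, complementary ones give $\phi = -1$, and partial overlap falls in between, which is why a high cutoff on $\phi$ isolates only near-identical sharing behavior.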

Abstract

The ability to detect coordinated activity in communication networks is an ongoing challenge. Prior approaches typically treat any activity exceeding a specific similarity threshold as coordinated. However, the choice of such a threshold is often arbitrary, and the activity it flags can be difficult to distinguish from grassroots organized behavior. In this paper, we investigate a set of Twitter retweeting data collected around the 2022 US midterm elections, using a latent sharing-space model in which we identify the main components of an association network thresholded with a $k$-nearest-neighbor criterion. This approach identifies a distribution of association values that play different roles in the network at different ranges, and the shape of the distribution suggests a natural place to threshold for coordinated user candidates. We find coordination candidates belonging to two broad categories: one involving music awards and the promotion of Korean pop or Taylor Swift, the other involving users engaged in political mobilization. In addition, the latent space suggests common motivations for different coordinated groups that are otherwise fragmented by an appropriately high threshold criterion for coordination.
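The abstract does not spell out how the $k$-nearest-neighbor criterion is applied, so the following is only a sketch under one plausible reading: each user's $k$ nearest neighbors in the latent sharing space define the user pairs kept as edges of the association network. The coordinates, user labels, the value of `k`, and the use of Euclidean distance are all assumptions made for illustration.

```python
from math import dist

def knn_pairs(coords, k):
    """Return unordered user pairs where at least one user is among
    the other's k nearest neighbors (Euclidean distance, an assumption)."""
    users = sorted(coords)
    pairs = set()
    for u in users:
        # Rank all other users by distance to u; ties broken by name.
        others = sorted(
            (v for v in users if v != u),
            key=lambda v: (dist(coords[u], coords[v]), v),
        )
        for v in others[:k]:
            pairs.add(tuple(sorted((u, v))))
    return pairs

# Hypothetical latent coordinates (e.g. two retained SVD dimensions).
coords = {
    "a": (0.0, 0.0),
    "b": (0.1, 0.0),
    "c": (0.0, 0.2),
    "d": (5.0, 5.0),
    "e": (5.1, 5.0),
}

pairs = knn_pairs(coords, k=1)
print(sorted(pairs))  # far fewer pairs than all 10 possible combinations
```

Restricting attention to nearest neighbors keeps the number of pairwise comparisons manageable and avoids scoring pairs of users whose sharing behavior is unrelated to begin with.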

Paper Structure (14 sections, 2 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Frequencies of edge weights between pairs of users for which $\phi$ was measured. The yellow region corresponds to values of $\phi$ that we treat as an indicator of coordination. The black and yellow regions correspond to critical $\phi$ values derived for a 0.0001 significance level, Bonferroni-corrected for the number of user comparisons made (109,118). Using statistical significance as a threshold mechanism would incorrectly suggest that almost all our users are coordinated. Instead, we utilize the bimodality of the distribution as seen here and in Figure \ref{fig:phi_thresh}.
  • Figure 2: Log ratio of edges in the network to edges in the largest connected component for increasing threshold of $\phi$ ($x$-axis). As the threshold is increased, edges falling below the threshold are removed. Two peaks are observed with increasing trends corresponding to threshold values where the largest connected component aggressively decomposes into smaller components. Point colors encode the edge counts on log scale, as indicated in the key.
  • Figure 3: Association network thresholded at three $\phi$ values. Our chosen threshold is represented by the last plot, $\phi \geq 0.67$.
  • Figure 4: User clusters based on retweeting behavior within the latent sharing space. Left-hand plots include all users; right-hand plots are coordination candidates with $\phi \geq 0.67$. Cluster memberships are from Figure \ref{fig:size_clust}. Density plots for each are on the respective diagonals.
  • Figure 5: Bar plot of cluster sizes. Shaded areas correspond to the number of coordinated users found in each respective cluster.
  • ...and 3 more figures
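
The captions of Figures 1 and 2 describe two computations that can be sketched briefly. The first function derives a Bonferroni-corrected critical $\phi$ from the identity $\chi^2 = n\phi^2$ for a 2×2 table (one degree of freedom); whether the authors used exactly this test is an assumption, and the sample size `n_obs` is hypothetical because the captions do not state it. The second function computes the quantity plotted in Figure 2, the log ratio of all surviving edges to the edges inside the largest connected component, for a single threshold. The toy edge list is invented for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

def critical_phi(n_obs, alpha=1e-4, n_comparisons=109_118):
    """Critical |phi| for a two-sided test at a Bonferroni-corrected level.

    Uses chi2 = n * phi**2 with 1 degree of freedom, so the critical
    phi is z / sqrt(n), where z is the standard normal quantile.
    n_obs (number of tweets per comparison) is a hypothetical input.
    """
    alpha_adj = alpha / n_comparisons
    z = -NormalDist().inv_cdf(alpha_adj / 2)
    return z / sqrt(n_obs)

def lcc_log_ratio(edges, threshold):
    """log(#edges kept / #edges in the largest connected component)
    after dropping edges with weight below the threshold."""
    kept = [(u, v) for u, v, w in edges if w >= threshold]
    if not kept:
        return None  # no network left at this threshold
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in kept:
        parent[find(u)] = find(v)
    edge_counts = {}
    for u, _ in kept:
        root = find(u)
        edge_counts[root] = edge_counts.get(root, 0) + 1
    return log(len(kept) / max(edge_counts.values()))

# Hypothetical weighted edges (user, user, phi).
edges = [
    ("a", "b", 0.90), ("b", "c", 0.80), ("c", "d", 0.30),
    ("d", "e", 0.85), ("e", "f", 0.70),
]

print(critical_phi(n_obs=10_000))
for t in (0.2, 0.5, 0.75):
    print(t, lcc_log_ratio(edges, t))
```

With a large number of observations the critical $\phi$ is small, which is consistent with the caption's point that a significance test alone would flag nearly every pair. The log ratio stays at zero while the network is one component and rises when a threshold splits the largest component, which is the behavior the peaks in Figure 2 describe.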