Extracting real social interactions from social media: a debate of COVID-19 policies in Mexico

Alberto García-Rodríguez, Tzipe Govezensky, Carlos Gershenson, Gerardo G. Naumis, Rafael A. Barrio

TL;DR

The study addresses how to distinguish real social interactions from bot-driven or other non-human activity in Twitter discussions of COVID-19 policy in Mexico. It analyzes a directed retweet network and a time-windowed co-retweet network derived from ~3 weeks of data, using degree distributions, directed clustering, and Louvain community detection to uncover polarization and interaction structure. Key findings include power-law degree distributions with a crossover; a small subset (~2% of nodes) in which clustering decays as $C_j \sim \frac{C_0}{k_{in}^{\gamma}}$ with $\gamma \approx 1.297$, indicating feedback; and superspreaders/bots that dominate the co-retweet topology, which suggests a practical path to separating real interactions from non-human activity. The work provides a framework for quantifying real social dynamics in polarized online discourse and offers actionable insights for monitoring bot-driven amplification in health-policy discussions.

Abstract

A study of the dynamical formation of networks of friends and enemies in social media, in this case Twitter, is presented. We characterise single-node properties of such networks, such as the clustering coefficient and the degree, to investigate the structure of links. The results indicate that the network is made up of three kinds of nodes: one group with a high clustering coefficient but very small degree, a second group with zero clustering coefficient and variable degree, and a third group in which the clustering coefficient decays as a power law of the degree. This third group represents $\sim2\%$ of the nodes and is characteristic of dynamical networks with feedback. This part of the lattice seemingly represents strongly interacting friends in a real social network.
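
The three-way split of nodes described above can be illustrated on a toy graph. What follows is a minimal sketch under stated assumptions, not the authors' procedure: it uses the standard undirected local clustering coefficient (the paper works with a directed version), and the `small_degree` cutoff, the group labels, and the example edges are all hypothetical choices made for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj):
    """Per node, the fraction of neighbour pairs that are themselves linked."""
    coeff = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeff[node] = 0.0  # fewer than two neighbours: no triangle is possible
            continue
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeff[node] = 2.0 * linked / (k * (k - 1))
    return coeff


def classify_nodes(adj, small_degree=2):
    """Assign each node to one of three illustrative groups.

    The cutoff `small_degree` is a hypothetical parameter, not a value
    taken from the paper.
    """
    coeff = local_clustering(adj)
    groups = {}
    for node, nbrs in adj.items():
        if coeff[node] == 0.0:
            groups[node] = "zero_clustering"
        elif len(nbrs) <= small_degree:
            groups[node] = "high_clustering_small_degree"
        else:
            groups[node] = "clustering_decays_with_degree"
    return groups


# Hypothetical toy graph: a triangle (a, b, c) with a short tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
clustering = local_clustering(adj)
groups = classify_nodes(adj)

for node in sorted(adj):
    print(node, len(adj[node]), round(clustering[node], 3), groups[node])
```

On real data the third group would be identified from the shape of the clustering-versus-degree relation rather than from a fixed cutoff; the cutoff here only keeps the example small.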

Paper Structure

This paper contains 8 sections, 4 equations, 9 figures.

Figures (9)

  • Figure 1: Snapshot of a Twitter discussion. Edges between nodes represent retweets or mentions of another user. We pay particular attention to the subset shown in blue, which contains the users who connect with other users more than $8$ times. Yellow nodes have more than $5$ edges.
  • Figure 2: (A) Communities identified in political discussions, shown with different colors. The communities were detected using the Louvain method [Blondel_2008]. Observe that most of the nodes at the periphery have degree one, and that the network is polarized into groups with opposite points of view about the subject, in this case COVID-related issues. The links are curved: links that curve clockwise are outgoing, and links that curve counterclockwise are incoming. (B) A zoom of the network showing the two main groups.
  • Figure 3: Log-log plot of the distributions of user popularity (blue) and of the number of users each user supports (red). The data sets in light colors correspond to raw data with linear binning, while the points in dark colors are the results of logarithmic binning. The solid lines correspond to power-law fits, with exponents detailed in the inset. Observe how the support for other users decreases less sharply and is more scattered than the popularity. However, a crossover is seen near $k=10$. The degree $k$ is shifted by one so that nodes can be plotted in the logarithmic binning.
  • Figure 4: Snapshot of a retweet network over a period of approximately 3 weeks in 2020, with a focus on the clustering coefficient. Each node is colored according to its clustering value, and each link takes the color of its origin node. Red is assigned to clustering 0.0 and blue to clustering 0.5. For this network, approximately 77% of users have a clustering coefficient of 0.0. Node size is determined by the in-degree. Links that curve clockwise are outgoing, and links that curve counterclockwise are incoming.
  • Figure 5: Clustering coefficient versus the in-degree of the node in the retweet network. We normalized $k_{in}$ and eliminated all nodes with clustering equal to zero because these nodes mostly represent the simplest dynamics in the network: users generating a retweet without any prior coordination. These users represent 99.7% of accounts. Users detected as having coincided with other users at least three times, by placing a hashtag or retweeting the same message, are shown in red and represent 1.6%; the rest are shown in blue. The inset shows a zoom of the data, indicating how the network separates into three kinds of nodes: those that fall almost on each axis, and a third class that does not follow this tendency and resembles lattices obtained with dynamical feedback.
  • ...and 4 more figures
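
The logarithmic binning and power-law fits mentioned in the captions of Figures 3 and 5 can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes base-2 bins and reads the shift of $k$ by one as a way to include nodes with $k=0$. It fits by ordinary least squares in log-log space, which is a simple but not statistically rigorous estimator. The degree sample, the prefactor `0.5`, and the function names are hypothetical; the exponent `1.297` is the value of $\gamma$ quoted in the TL;DR and is used only to generate synthetic data.

```python
import math


def log_binned_density(degrees):
    """Bin shifted degrees k + 1 into [2^i, 2^(i+1)); return (centre, density) pairs."""
    counts = {}
    for k in degrees:
        i = (k + 1).bit_length() - 1  # exact floor(log2(k + 1)) for integers
        counts[i] = counts.get(i, 0) + 1
    n = len(degrees)
    points = []
    for i in sorted(counts):
        width = 2 ** i  # number of integers in [2^i, 2^(i+1))
        centre = math.sqrt(2 ** i * 2 ** (i + 1))  # geometric centre of the bin
        points.append((centre, counts[i] / (n * width)))
    return points


def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (slope, prefactor)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    prefactor = math.exp(my - slope * mx)
    return slope, prefactor


# Hypothetical degree sample; zeros are allowed thanks to the k + 1 shift.
degrees = [0, 0, 0, 0, 1, 1, 2, 3, 6]
points = log_binned_density(degrees)

# Synthetic clustering data obeying C = C0 / k^gamma (C0 = 0.5 is made up).
ks = [1, 2, 4, 8, 16, 32]
cs = [0.5 * k ** -1.297 for k in ks]
slope, c0 = fit_power_law(ks, cs)

print(points)
print(slope, c0)
```

Dividing each bin count by its width keeps the binned values comparable to a probability density, so the widening bins at large $k$ do not artificially flatten the tail.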