Locating privileged spreaders on an Online Social Network

Javier Borge-Holthoefer, Alejandro Rivero, Yamir Moreno

TL;DR

The paper addresses how information diffuses through online social networks and who can trigger system-wide cascades, using data from the 15M movement to connect topology with diffusion dynamics. It combines time-resolved activity cascades with k-core decomposition to compare degree and coreness as predictors of spreading capacity across slow-growth and bursty periods. Key findings show that seeds with higher coreness or higher degree tend to generate larger cascades, but that bursts of activity reduce topological discrimination, making influential spreaders harder to identify. The work has practical implications for targeted information campaigns and invites empirical validation of diffusion models, emphasizing the role of time evolution in understanding real-world cascades.

Abstract

Social media have provided plentiful evidence of their capacity for information diffusion. Fads and rumors, but also social unrest and riots, travel fast and affect large fractions of the population participating in online social networks (OSNs). This has spurred much research regarding the mechanisms that underlie social contagion, and also who (if anyone) can unleash system-wide information dissemination. Access to real data, both regarding topology --the network of friendships-- and dynamics --the actual way in which OSN users interact--, is crucial to decipher how the former facilitates the latter's success, understood as efficiency in information spreading. With the quantitative analysis that stems from complex network theory, we discuss who (and why) has privileged spreading capabilities when it comes to information diffusion. This is done considering the evolution of an episode of political protest which took place in Spain, spanning one month in 2011.

Paper Structure

This paper contains 7 sections, 4 figures.

Figures (4)

  • Figure 1: (Color online) Temporal evolution of the activity in the online social network. In green, the proportion of nodes that had shown some activity at a certain time $t$. In yellow, the cumulative proportion of emitted messages as a function of time. Note that the two lines evolve in almost the same way. According to this evolution, we have distinguished two sub-periods: one of them characterized as "slow growth" due to the low activity level and the other one tagged as "explosive" or "bursty" due to the intense information traffic within it.
  • Figure 2: (Color online) The figure illustrates the concept of cascade that is used throughout this article. User 1 emits a message at time $t$, and all of his followers automatically receive it. Thus, they are already counted as part of the cascade (small red circles). One of his followers (user 2, big blue node), driven by the previous message, decides to participate at time $t+\Delta t$, posting a message of his own. A second set of followers is then included in the cascade. Finally, a third node (user 3, big green circle) joins in and spreads the cascade further at time $t+2\Delta t$. A node cannot be counted twice; note, for example, that user 4 is also following node 3. Many nodes remain unaffected because they are not connected to any of the spreaders. The final size of the cascade is $\frac{N_{c}}{N} = \frac{22}{34}$; the success of the cascade largely depends on the capacity to contact a "leader" or "privileged spreader", i.e., a hub to whom many people listen and who decides to participate. The interesting point, however, is that the number of spreaders needed to attain such success is very low (3), and over 50% of the cascade is triggered by just one of them.
  • Figure 3: (Color online) Upper panels (a,b,c): Cascade size probability distributions for the different periods considered. Lower panels (d,e,f): Probability distributions of spreaders involved in the cascades for the same periods. The exact periods considered in the analyses are indicated at the top of each panel. See the text for further details.
  • Figure 4: (Color online) Upper left panel: average spreading capacity (with respect to the system size) of nodes grouped according to their $k$-core. $\frac{N_{c}}{N}$ grows with coreness, but the explosive period (red squares) shows a much less clear tendency, with many fluctuations and a lower overall spreading capacity compared to the slow growth period (black circles). Lower left panel: The same information is shown as a function of the degree. Again, the slow growth period is the best one at predicting the extent of a cascade. Interestingly, average cascades for the highest degrees outperform those triggered by the highest $k$-core nodes by an order of magnitude. See main text for discussion on this aspect. Right panels show the $k$-core and degree distributions, i.e., how many nodes belong to each class. Note that the highest core contains over 1000 users.
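
The cascade bookkeeping described in the Figure 2 caption and the $k$-core grouping used in Figure 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that are not in the source: time is discretized into integer bins of width $\Delta t$; any already-reached user who posts in the next bin becomes a spreader; the cascade ends at the first bin with no new spreader; and coreness is computed on an undirected version of the graph. The names `cascade`, `core_numbers`, `followers`, `posts` and `adj` are hypothetical.

```python
def cascade(followers, posts, seed, t0):
    """Return (reached, spreaders) for a cascade started by `seed` in time bin `t0`.

    followers: dict mapping a user to the set of users who follow them.
    posts:     dict mapping a user to the set of integer time bins in which they posted.
    Assumption: a reached user who posts in the next bin becomes a spreader,
    and the cascade stops at the first bin with no new spreader.
    """
    spreaders = {seed}
    # Followers of the seed receive the message and already count as part of the cascade.
    reached = {seed} | set(followers.get(seed, ()))
    t = t0 + 1
    while True:
        new = {u for u in reached - spreaders if t in posts.get(u, ())}
        if not new:
            break
        spreaders |= new
        for u in new:
            # Sets guarantee that a node is never counted twice.
            reached |= set(followers.get(u, ()))
        t += 1
    return reached, spreaders


def core_numbers(adj):
    """Coreness of every node of an undirected graph given as dict node -> set of neighbours.

    Standard peeling: repeatedly remove a node of minimum remaining degree;
    its coreness is the largest minimum degree seen so far.
    This simple version is quadratic in the number of nodes.
    """
    degree = {u: len(vs) for u, vs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=degree.get)
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


if __name__ == "__main__":
    # Toy data (hypothetical): 10 users, user 1 seeds the cascade in bin 0.
    followers = {1: {2, 4, 5}, 2: {3, 6}, 3: {4, 7}, 9: {10}}
    posts = {1: {0}, 2: {1}, 3: {2}, 9: {1}}
    reached, spreaders = cascade(followers, posts, seed=1, t0=0)
    n_users = 10
    print("relative cascade size N_c / N:", len(reached) / n_users)
    print("spreaders:", sorted(spreaders))

    # Triangle a-b-c with a pendant node d attached to a.
    adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
    print("coreness:", core_numbers(adj))
```

Averaging `len(reached) / n_users` over seeds that share the same coreness, or the same degree, gives the kind of curve summarized in the left panels of Figure 4. On the real data the grouping, the bin width and the treatment of edge direction would need to follow the choices made in the paper.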