Timescale-agnostic characterisation for collective attention events

Tristan J. B. Cann, Iain S. Weaver, Hywel T. P. Williams

TL;DR

An agent-based model for generating collective attention events is developed. It reveals that three of the four characteristic behaviours found in Twitter hashtag usage form a continuum determined by model parameters rather than discrete categories, suggesting that collective attention in social systems develops in line with a set of universal principles independent of effects inherent to system scale.

Abstract

Online communications, and in particular social media, are a key component of how society interacts with and promotes content online. Collective attention on such content can vary wildly. The majority of breaking topics quickly fade into obscurity after only a handful of interactions, while other content can "go viral", seeing sustained interaction by large audiences over long periods. In this paper we investigate the mechanisms behind such events and introduce a new representation that enables direct comparison of events over diverse time and volume scales. We find four characteristic behaviours in the usage of hashtags on Twitter that are indicative of different patterns of attention to topics. We go on to develop an agent-based model for generating collective attention events to test the factors affecting the emergence of these phenomena. This model can reproduce the characteristic behaviours seen in the Twitter dataset using a small set of parameters, and reveals that three of these behaviours form a continuum determined by model parameters rather than discrete categories. These insights suggest that collective attention in social systems develops in line with a set of universal principles independent of effects inherent to system scale, and the techniques we introduce here present a valuable opportunity to infer the possible mechanisms of attention flow in online communications.

Paper Structure

This paper contains 20 sections, 6 equations, 14 figures, 3 tables, 2 algorithms.

Figures (14)

  • Figure 1: Different hashtags can show similar trends when viewed at different temporal scales (defined by bin width). To compare these directly with each other careful normalisation across volume also needs to be applied. Here it is illustrated how similar behaviour of a short-lived peak and gradual decline can manifest on different time scales.
  • Figure 2: In order to resolve the differences in number of tweets between different timeseries segments we apply our scale-independent representation. For reference, the daily binned timeseries is shown in Fig. \ref{sh:fig:schem_timeseries}. In Fig. \ref{sh:fig:schem_cdf_tweet_ids} we calculate the CDF and find the tweets observed between successive quantiles of interest (red arrows). We use these tweet IDs to produce a vector of the desired length for each interval as shown in Fig. \ref{sh:fig:schem_vec}. Here we set $N=10$ for visual clarity, but use $N=50$ for all subsequent analysis.
  • Figure 3: #cpc18, used to signify attendance or interest in the 2018 UK Conservative party conference. Choosing a bin width of one day presents this period as a single event, gradually building up to a peak. At shorter time resolutions, we see that activity rises and falls across each day of the conference. Fig. \ref{sh:fig:cpc18_d} shows the scale-independent representation of this period.
  • Figure 4: The four characteristic shapes found around increased hashtag usage rate. Different coloured profiles indicate different events. The dashed line approximates the value of constant activity evenly distributed through the lifetime. a) The right-tailed shape indicates events with a sudden increase in attention followed by a gradual decrease. b) The arch-shaped profile indicates events with both a gradual increase and a gradual decrease in attention. c) The left-tailed shape indicates events with a gradual increase in attention before a sudden decrease. d) The abrupt shift shape indicates a sudden transition in and out of a period of steady, heightened attention.
  • Figure 5: Projection of the scale-independent representations of hashtag intervals using the two dominant components under t-SNE analysis. Point colours denote the type classification by the authors and point size is proportional to the square root of total usage in the interval. Grey, low opacity points indicate periods of activity which did not clearly show any of the four characteristic shapes discussed in Section \ref{sec:shapes}. Hashtag and interval number labels are provided for intervals with more than 22,500 tweets, and the label for #brexit applies to all intervals which overlap in the t-SNE projection. This figure excludes hashtags with anomalous behaviour from repeated usage in a single tweet, whose representations were not compatible with t-SNE projection. We see that the left-tailed, arch-shaped and right-tailed profiles approximate frontiers in this projection and suggest a transition of behaviours between activity shapes.
  • ...and 9 more figures
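
The scale-independent representation outlined in the Figure 2 caption can be illustrated with a minimal sketch. The captions do not state exactly what each vector entry holds, so this sketch assumes that entry k is the fraction of the event's lifetime that elapses between the k-th and (k+1)-th quantiles of cumulative tweet volume. Under that assumption, constant activity gives a flat vector with every entry equal to 1/N, which is consistent with the dashed reference line described for Figure 4. The function name `scale_independent_vector` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def scale_independent_vector(timestamps: Sequence[float], n: int = 50) -> List[float]:
    """Sketch of a timescale- and volume-agnostic event profile.

    ASSUMPTION (not confirmed by the source): entry k is the share of the
    event lifetime elapsed between the k/n and (k+1)/n quantiles of the
    empirical CDF of tweet times.
    """
    ts = sorted(timestamps)
    if len(ts) < 2 or ts[-1] == ts[0]:
        raise ValueError("need at least two tweets with distinct timestamps")
    if n < 1:
        raise ValueError("n must be a positive integer")

    lifetime = ts[-1] - ts[0]

    # Time of the tweet sitting at each quantile k/n of cumulative volume.
    quantile_times = []
    for k in range(n + 1):
        idx = round(k * (len(ts) - 1) / n)
        quantile_times.append(ts[idx])

    # Normalising by the lifetime removes the time scale; working in
    # quantiles of the CDF removes the volume scale.
    return [
        (quantile_times[k + 1] - quantile_times[k]) / lifetime
        for k in range(n)
    ]


# Constant activity: one tweet per time unit.
flat = scale_independent_vector(range(101), n=10)

# The same event measured in different time units gives the same profile.
flat_rescaled = scale_independent_vector([3600 * t for t in range(101)], n=10)

# Activity concentrated early in the lifetime (tweets at t = i**2).
early_burst = scale_independent_vector([i ** 2 for i in range(101)], n=10)
```

With this convention, small entries mark periods of intense activity (a quantile of the volume is consumed quickly) and large entries mark quiet periods, so `early_burst` rises from its first entry to its last while `flat` stays at 1/N throughout.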