How to Detect Information Voids Using Longitudinal Data from Social Media and Web Searches

Irene Scalco; Francesco Gesualdo; Roy Cerqueti; Matteo Cinelli

TL;DR

Information voids are associated with a higher prevalence of misinformation. They therefore represent problematic hotspots in which individuals are more likely to be misled by low-quality online content, and the finding provides empirical support for including information voids in mechanistic explanations of misinformation emergence.

Abstract

The model of the attention economy, where content producers compete for the attention of users, relies on two key forces: information supply and demand. This study leverages the feedback loop between these forces to develop a method for detecting and quantifying information voids, i.e., periods in which little or no reliable information is available on a given topic. Using a case study on the COVID-19 vaccine rollout in six European countries, and drawing on data from multiple platforms including Facebook, Google, Twitter, Wikipedia, and online news outlets, we examine how information voids emerge, persist, and correlate with a decline in the proportion of high-quality information circulating online. By conceptualising information voids as a specific regime of information spreading, we also quantify their counterpart, information overabundance, which constitutes a central component of the current definition of infodemic. We show that information voids are associated with a higher prevalence of misinformation, thus representing problematic hotspots in which individuals are more likely to be misled by low-quality online content. Overall, our findings provide empirical support for the inclusion of information voids in mechanistic explanations of misinformation emergence.

Paper Structure (34 sections, 6 equations, 16 figures, 10 tables, 2 algorithms)

Introduction
Data Science Pipeline
Results and Discussion
Conclusions
Methods
Supplementary Information

Figures (16)

Figure 1: Definition of the information delta regimes.
Figure 2: Pictorial representation of the synthetic data simulation process. a) Supply and demand are generated from Gaussian distributions. b) Anomalies of varying magnitudes are randomly added to the series, generating artificial spikes. c) The information delta is computed. d) The anomaly detection algorithm is applied to the resulting delta time series.
Figure 3: Daily variation of the supply-demand delta computed from Facebook supply and Wikipedia demand. Each panel displays trends over time for a specific combination of country (column label) and vaccine (row label). Time series for $\delta_t$ were capped at $\pm 10$ for visualisation purposes, although much larger values, i.e. more severe anomalies, occur. The yellow-shaded region highlights the period from December 1, 2020, to February 1, 2021, encompassing the time immediately before and after the introduction of the vaccine. The empty panel corresponds to the Sputnik-V vaccine in Denmark, for which no information is available during the same time span.
Figure 4: a) Distribution of anomalies before and after the vaccine roll-out. The x-axis represents days from the national start of the vaccination campaign (vertical dashed line at day 0). b) Average duration of anomalies by source. Negative anomalies are displayed in blue and positive anomalies in red.
Figure 5: a) Cumulative value of anomalies for the Moderna vaccine (mRNA-1273), displayed by source and country. The red line indicates the date of authorization by the EMA. b) Cumulative anomalies related to the AstraZeneca vaccine, broken down by source and country. The red vertical lines mark key dates: 1 March, when the first cases of thrombosis with thrombocytopenia syndrome (TTS) were being reported and the debate over AstraZeneca’s safety was beginning to intensify across several European states, and 7 April, when the EMA Pharmacovigilance Risk Assessment Committee (PRAC) confirmed a causal link with TTS as a very rare adverse event.
...and 11 more figures
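
The four steps in the Figure 2 caption (Gaussian supply and demand, injected spikes, delta computation, anomaly detection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the delta is taken to be robustly standardised supply minus robustly standardised demand (so negative values suggest a void and positive values overabundance), the detector is a simple median/MAD threshold standing in for whatever algorithm the authors use, and the function names, spike magnitudes, and threshold are hypothetical.

```python
import numpy as np

MAD_TO_SIGMA = 1.4826  # scales the MAD to match the std of a Gaussian


def robust_z(x):
    """Standardise with median and MAD so injected spikes do not inflate the scale."""
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (MAD_TO_SIGMA * mad)


def simulate_series(n_days=200, n_spikes=4, seed=0):
    """Steps a) and b): Gaussian supply/demand plus random artificial spikes."""
    rng = np.random.default_rng(seed)
    supply = rng.normal(0.0, 1.0, size=n_days)
    demand = rng.normal(0.0, 1.0, size=n_days)
    # Pick distinct days: the first half get demand spikes, the rest supply spikes.
    days = rng.choice(n_days, size=2 * n_spikes, replace=False)
    void_days, over_days = days[:n_spikes], days[n_spikes:]
    # Spike magnitudes (15 to 30 noise sds) are an arbitrary illustrative choice.
    demand[void_days] += rng.uniform(15.0, 30.0, size=n_spikes)
    supply[over_days] += rng.uniform(15.0, 30.0, size=n_spikes)
    return supply, demand, np.sort(void_days), np.sort(over_days)


def information_delta(supply, demand):
    """Step c): assumed definition, standardised supply minus standardised demand."""
    return robust_z(supply) - robust_z(demand)


def detect_anomalies(delta, threshold=3.5):
    """Step d): flag days whose robust z-score of the delta exceeds the threshold."""
    score = robust_z(delta)
    voids = np.flatnonzero(score < -threshold)        # demand outpaces supply
    overabundance = np.flatnonzero(score > threshold)  # supply outpaces demand
    return voids, overabundance


supply, demand, void_days, over_days = simulate_series()
delta = information_delta(supply, demand)
voids, overabundance = detect_anomalies(delta)
print("injected void days:", void_days.tolist())
print("detected void days:", voids.tolist())
```

Because the spike days are known in advance, a synthetic run like this makes it possible to check whether a given detector recovers the injected anomalies before applying it to real platform data, where the ground truth is unknown.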
