Revealing COVID-19's Social Dynamics: Diachronic Semantic Analysis of Vaccine and Symptom Discourse on Twitter

Zeqiang Wang, Jiageng Wu, Yuqi Wang, Wei Wang, Jie Yang, Jon Johnson, Nishanth Sastry, Suparna De

TL;DR

This paper proposes an unsupervised dynamic word embedding method that captures longitudinal semantic shifts in social media data without predefined anchor words. Applied to COVID-19 Twitter data, it reveals how the semantics of vaccine- and symptom-related entities evolve across pandemic stages and how these shifts may correlate with real-world statistics.

Abstract

Social media is recognized as an important source for deriving insights into public opinion dynamics and social impacts due to the vast textual data generated daily and the 'unconstrained' behavior of people interacting on these platforms. However, such analyses prove challenging due to the semantic shift phenomenon, where word meanings evolve over time. This paper proposes an unsupervised dynamic word embedding method to capture longitudinal semantic shifts in social media data without predefined anchor words. The method leverages word co-occurrence statistics and dynamic updating to adapt embeddings over time, addressing the challenges of data sparseness, imbalanced distributions, and synergistic semantic effects. Evaluated on a large COVID-19 Twitter dataset, the method reveals semantic evolution patterns of vaccine- and symptom-related entities across different pandemic stages, and their potential correlations with real-world statistics. Our key contributions include the dynamic embedding technique, empirical analysis of COVID-19 semantic shifts, and discussions on enhancing semantic shift modeling for computational social science research. This study enables capturing longitudinal semantic dynamics on social media to understand public discourse and collective phenomena.
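
To make the idea of measuring semantic shift from co-occurrence statistics concrete, here is a minimal sketch. It is not the paper's method: it skips embeddings and dynamic updating entirely and simply compares a word's raw co-occurrence profile in two time slices. The function names (`cooccurrence`, `shift_score`), the window size, and the toy tweets are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def cooccurrence(docs, window=2):
    """Count, for each word, the words that appear within `window` tokens of it."""
    counts = defaultdict(Counter)
    for tokens in docs:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shift_score(word, slice_a, slice_b, window=2):
    """Hypothetical shift score: 1 - cosine similarity of the word's contexts."""
    ctx_a = cooccurrence(slice_a, window)[word]
    ctx_b = cooccurrence(slice_b, window)[word]
    return 1.0 - cosine(ctx_a, ctx_b)


# Toy, made-up tokenized tweets for two time slices.
early = [["vaccine", "trial", "phase"], ["vaccine", "trial", "results"]]
late = [["vaccine", "booster", "mandate"], ["vaccine", "booster", "dose"]]

print(shift_score("vaccine", early, late))
```

A score near 0 means the word keeps the same neighbours in both slices; a score near 1 means its contexts no longer overlap. A real pipeline would work on normalized statistics or learned embeddings rather than raw counts.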

Paper Structure

This paper contains 14 sections, 6 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Overall framework of the proposed unsupervised dynamic word embedding method. (1) Co-occurrence Analysis: Word co-occurrence matrices are computed from a diachronic corpus and normalized; (2) Adaptive Selection of Time Slices: Adjacent time slices with high word co-occurrence similarity are merged adaptively; (3) Dynamic Word Embeddings Update: Word embeddings are dynamically updated based on the current and previous time slices; (4) Semantic Shift Detection and Analysis: Word semantic shifts are detected via embedding similarity and by tracking how the associations of word pairs change over time.
  • Figure 2: Psychological symptom semantic shift trajectories across different periods. Lines and dots of different colors represent the semantic shift trajectories of various symptoms. Our framework automatically identifies three significant stages of semantic shift, which correspond to the three phases of the pandemic as compared to the number of new COVID-19 cases reported by the WHO. (a) shows the semantic relationships of these symptoms at different stages of the pandemic. (b) visually presents the distribution of symptoms in semantic space across different periods, illustrated using gradients of text and color. Each point in the projection graph represents the semantic position of a symptom at a specific period. By analyzing the trajectories, we observe a converging trend as the global spread of the pandemic progresses, indicating increasing semantic association.
  • Figure 3: Dynamic longitudinal analysis of COVID-19 symptom-vaccine semantic associations. (a) and (b) show the semantic correlations during the early outbreak (Feb 2020-Sep 2020) and the global pandemic (Jun 2021-Apr 2022), respectively. By analyzing how symptom-symptom and symptom-vaccine semantic associations change across periods, we examine potential combination patterns of symptoms and the sensitivity of symptoms to vaccine protection.
  • Figure 4: COVID-19 timeline.
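
Stage (2) in the Figure 1 caption, merging adjacent time slices whose co-occurrence statistics are similar, can be sketched as follows. This is an illustrative approximation only: the similarity measure (cosine over normalized pair frequencies), the greedy left-to-right merge, the `threshold` value, and all function names are assumptions, not details taken from the paper.

```python
from collections import Counter
import math


def pair_frequencies(docs, window=2):
    """Normalized co-occurrence frequencies over unordered word pairs."""
    counts = Counter()
    for tokens in docs:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(len(tokens), i + window + 1)):
                counts[tuple(sorted((word, tokens[j])))] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_similar_slices(slices, threshold=0.5, window=2):
    """Greedily fold each slice into the previous group when similarity >= threshold."""
    if not slices:
        return []
    merged = [list(slices[0])]
    for docs in slices[1:]:
        sim = cosine(pair_frequencies(merged[-1], window),
                     pair_frequencies(docs, window))
        if sim >= threshold:
            merged[-1].extend(docs)      # same regime: extend the current group
        else:
            merged.append(list(docs))    # discourse changed: start a new group
    return merged


# Toy, made-up time slices (each is a list of tokenized tweets).
month_1 = [["fever", "cough", "fatigue"]]
month_2 = [["fever", "cough", "fatigue"]]
month_3 = [["vaccine", "booster", "dose"]]

groups = merge_similar_slices([month_1, month_2, month_3])
print(len(groups))
```

In this toy run the first two months share identical co-occurrence patterns and collapse into one group, while the third starts a new one. The threshold controls how coarse the resulting periods are, and it would need tuning on real data.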