A Social Dynamical System for Twitter Analysis

Zhiping Xiao, Xinyu Wang, Yifang Qin, Zijie Huang, Mason A. Porter, Yizhou Sun

TL;DR

This work presents LSDS, a Latent Social Dynamical System that learns latent opinion trajectories from textual posts in online social networks. It encodes posts made before the prediction start time into latent states, evolves those states with a Graph Neural ODE, and decodes them to predict interactions and polarity on Twitter-derived data. The experiments indicate that encoder quality dominates performance, that various opinion-dynamics ODEs can be plugged in flexibly, and that long-term predictions remain strong despite data limitations. The approach offers a scalable framework for studying real-world opinion evolution, with applications in policy, marketing, and information diffusion. The authors plan to release their datasets and code for reproducibility.

Abstract

Understanding the evolution of public opinion is crucial for informed decision-making in various domains, particularly public affairs. The rapid growth of social networks, such as Twitter (now rebranded as X), provides an unprecedented opportunity to analyze public opinion at scale without relying on traditional surveys. With the rise of deep learning, Graph Neural Networks (GNNs) have shown great promise in modeling online opinion dynamics. Notably, classical opinion dynamics models, such as DeGroot, can be reformulated within a GNN framework. We introduce the Latent Social Dynamical System (LSDS), a novel framework for modeling the latent dynamics of social media users' opinions based on textual content. Since expressed opinions may not fully reflect underlying beliefs, LSDS first encodes post content into latent representations. It then leverages a GraphODE framework, using a GNN-based ODE function to predict future opinions. A decoder subsequently uses these predicted latent opinions to perform downstream tasks, such as interaction prediction, which serve as benchmarks for model evaluation. Our framework is highly flexible, supporting various opinion dynamics models as ODE functions, provided they can be adapted into a GNN-based form. It also accommodates different encoder architectures and is compatible with diverse downstream tasks. To validate our approach, we constructed dynamic datasets from Twitter data. Experimental results demonstrate the effectiveness of LSDS, highlighting its potential for future applications. We plan to publicly release our dataset and code upon the publication of this paper.
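
To make the remark about DeGroot concrete, the sketch below writes one DeGroot update as a weight-free message-passing step with mean aggregation, and then integrates a continuous-time analogue with explicit Euler steps. It is an illustration under stated assumptions, not the LSDS implementation: the three-user graph, the function names, and the choice of $dx/dt = (W - I)x$ as the continuous form are all hypothetical, and the paper's actual ODE function $g_i$ operates on learned latent states rather than on scalar opinions.

```python
import numpy as np


def row_normalize(adj):
    """Turn a nonnegative adjacency matrix into a row-stochastic trust matrix W."""
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every node needs at least one edge (add self-loops)")
    return adj / row_sums


def degroot_step(W, x):
    """One DeGroot update x(t+1) = W x(t).

    Read as a GNN layer: each node averages its neighbors' opinions
    (mean aggregation), with no learned weights and no nonlinearity.
    """
    return W @ x


def degroot_ode_euler(W, x0, t_end, dt=0.01):
    """Explicit Euler integration of dx/dt = (W - I) x.

    This continuous-time form is an assumption made for illustration;
    a graph ODE would replace (W - I) x with a learned GNN function.
    """
    x = np.array(x0, dtype=float)
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        x = x + dt * (W @ x - x)
    return x


# Hypothetical 3-user path graph with self-loops.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
W = row_normalize(A)

# Scalar opinions on a liberal-conservative axis in [-1, 1].
x0 = np.array([-1.0, 0.0, 1.0])

x1 = degroot_step(W, x0)                    # one discrete update
x_T = degroot_ode_euler(W, x0, t_end=20.0)  # long-horizon continuous evolution

print("after one DeGroot step:", x1)
print("after ODE integration: ", x_T)
```

In this toy graph the two extreme users each move halfway toward the moderate one after a single step, and the continuous dynamics drive all three opinions toward a common value. Swapping `W @ x - x` for another opinion-dynamics rule, or for a learned GNN, changes only the right-hand side of the ODE, which is the kind of flexibility the abstract describes.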

Paper Structure

This paper contains 36 sections, 30 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Illustration of opinion dynamics in a network. Nodes represent individuals, and edges denote interactions. Colors indicate observed opinions along a liberal--conservative spectrum, with darker shades representing more extreme opinions. The darkest blue and darkest red correspond to the most extreme positions. White nodes signify instances where opinions are unobserved at the given time.
  • Figure 2: A schematic overview of our model's architecture. We implement the graph ODE function $g_i$ as a GNN function, where the edges in the network depict the follower--followee relationships in the original graph (i.e., the graph in the bottom-left corner of this figure).
  • Figure 3: The number of interactions observed in our data set throughout 2020, from the first week to the last. The average number of total replies per day in each week consistently remains below $1$. The solid line represents the mean values, while the shaded areas indicate the standard deviation.
  • Figure 4: The number of tweets included in our data set throughout 2020, from the first week to the last. There are significantly fewer observations in the first and last weeks because these weeks are incomplete, meaning not all days are fully observed. The solid line represents the mean values, while the shaded areas indicate the standard deviation.
  • Figure 5: The mean Fuzzy Entropy scores of each account's text embedding values, over all dimensions of the Sentence-BERT [reimers2019sentence] embedding, and the mean Fuzzy Entropy scores of the same account's polarity scores from PEM [xiao2023detecting].
  • ...and 6 more figures