Rhythm of Opinion: A Hawkes-Graph Framework for Dynamic Propagation Analysis
Yulong Li, Zhixiang Lu, Feilong Tang, Simin Lai, Ming Hu, Yuxuan Zhang, Haochen Xue, Zhaodong Wu, Imran Razzak, Qingxia Li, Jionglong Su
TL;DR
This work tackles the challenge of modeling dynamic public opinion propagation on social media, where temporal evolution, hierarchical comment structures, and cross-topic influences interact. It integrates a high-dimensional Hawkes process with Graph Neural Networks to jointly model when comments arrive, how sentiment diffuses across hierarchical levels, and how topics influence one another; each dimension of the process is indexed by $\omega=(l,c)$ and has intensity $\lambda_{\omega}(t)$. The authors release the VISTA dataset, comprising 159 trending topics, 47,207 posts, 327,015 second-level comments, and 29,578 third-level comments, annotated with 11 sentiment categories to enable interpretable analysis of sentiment diffusion within and across topics. The proposed framework yields interpretable predictions of temporal and structural propagation and provides a robust baseline for future studies on large-scale, multi-topic opinion dynamics.
Abstract
The rapid development of social media has significantly reshaped the dynamics of public opinion, producing complex interactions that traditional models fail to capture effectively. To address this challenge, we propose an innovative approach that integrates multi-dimensional Hawkes processes with a Graph Neural Network, modeling opinion propagation dynamics among nodes in a social network while accounting for the hierarchical relationships between comments. The extended multi-dimensional Hawkes process captures the hierarchical structure, multi-dimensional interactions, and mutual influences across different topics, forming a complex propagation network. Moreover, recognizing the lack of high-quality datasets that comprehensively capture the evolution of public opinion, we introduce a new dataset, VISTA. It includes 159 trending topics, corresponding to 47,207 posts, 327,015 second-level comments, and 29,578 third-level comments, covering diverse domains such as politics, entertainment, sports, health, and medicine. The dataset is annotated with detailed sentiment labels across 11 categories and clearly defined hierarchical relationships. When combined with our method, the dataset offers strong interpretability by linking sentiment propagation to the comment hierarchy and temporal evolution. Our approach provides a robust baseline for future research.
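To make the notion of a per-dimension intensity $\lambda_{\omega}(t)$ concrete, the following is a minimal sketch of a textbook multi-dimensional Hawkes intensity, not the model proposed in this work. It assumes an exponential decay kernel, treats $\omega=(l,c)$ as an opaque index tuple, and omits the Graph Neural Network component entirely; the names `hawkes_intensity`, `mu`, `alpha`, and `beta` are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical dimension index omega = (l, c), treated as an opaque tuple.
Dim = Tuple[int, int]


def hawkes_intensity(
    t: float,
    target: Dim,
    events: List[Tuple[float, Dim]],
    mu: Dict[Dim, float],
    alpha: Dict[Tuple[Dim, Dim], float],
    beta: float = 1.0,
) -> float:
    """Intensity of dimension `target` at time t.

    lambda_target(t) = mu[target]
        + sum over past events (s, src) with s < t of
          alpha[(src, target)] * exp(-beta * (t - s))

    The exponential kernel is an assumption made for this sketch.
    """
    rate = mu[target]  # baseline arrival rate of the target dimension
    for s, src in events:
        if s < t:  # only strictly earlier events excite the process
            weight = alpha.get((src, target), 0.0)  # missing pair: no influence
            rate += weight * math.exp(-beta * (t - s))
    return rate


# Toy usage: one event in dimension (1, 0) excites dimension (2, 0).
events = [(0.0, (1, 0))]
mu = {(2, 0): 0.1}
alpha = {((1, 0), (2, 0)): 0.5}
rate_at_one = hawkes_intensity(1.0, (2, 0), events, mu, alpha, beta=1.0)
```

In this toy setup the excitation from the single earlier event decays over time, so the intensity of the target dimension relaxes back toward its baseline `mu`.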
