Modeling the Popularity of Events on Web by Sparsity and Mutual-Excitation Guided Graph Neural Network
Jiaxin Deng, Linlin Jia, Junbiao Pang, Qingming Huang
TL;DR
This paper tackles web event popularity prediction by modeling how individual keywords excite overall popularity and how pairs of keywords influence each other. It introduces SMN, a graph neural network that builds a PMI-based keyword graph and models popularity additively through a base term, self-excitation, mutual excitation, and a CLIP-based image-context branch; sparsity is used to make keyword contributions interpretable. Empirical results on the Hot Events and HEP-PH datasets show that SMN variants outperform state-of-the-art CTR-style models and baseline GNN approaches, with ablations demonstrating the importance of both excitation mechanisms and multimodal fusion. The approach provides a scalable, interpretable framework for understanding why certain events go viral, with practical implications for content recommendation and trend analysis.
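To make the idea of a PMI-based keyword graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each event is given as a list of keywords, that probabilities are estimated from event-level co-occurrence counts, and that an edge is kept when the pointwise mutual information exceeds a threshold. The function name `build_pmi_graph`, the `threshold` parameter, and the toy events are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_pmi_graph(events, threshold=0.0):
    """Return {(kw_a, kw_b): pmi} for keyword pairs whose PMI exceeds threshold.

    Assumption: PMI(a, b) = log(p(a, b) / (p(a) * p(b))), with probabilities
    estimated as the fraction of events containing the keyword(s).
    """
    n = len(events)
    word_count = Counter()
    pair_count = Counter()
    for keywords in events:
        unique = sorted(set(keywords))  # count each keyword once per event
        word_count.update(unique)
        pair_count.update(combinations(unique, 2))

    edges = {}
    for (a, b), c_ab in pair_count.items():
        p_ab = c_ab / n
        p_a = word_count[a] / n
        p_b = word_count[b] / n
        pmi = math.log(p_ab / (p_a * p_b))
        if pmi > threshold:
            edges[(a, b)] = pmi
    return edges


# Hypothetical toy corpus: one keyword list per event.
events = [
    ["election", "debate"],
    ["election", "debate", "poll"],
    ["concert", "tickets"],
    ["concert", "poll"],
]
graph = build_pmi_graph(events)
```

In this toy corpus only pairs that co-occur more often than chance would predict survive the threshold, which is one simple way to keep the unified keyword graph sparse.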
Abstract
The content of a webpage that describes or posts an event in cyberspace inevitably reflects the viewpoints, values, and trends of physical society. Mapping a web event to a popularity score plays a pivotal role in sensing social trends from cyberspace. However, the complex semantic correspondence between texts and images, as well as the implicit text-image-popularity mapping mechanics, poses a significant challenge to this non-trivial task. In this paper, we address this problem from the viewpoint of understanding the interpretable mapping mechanics. Concretely, we organize the keywords from different events into a unified graph. The unified graph makes it possible to model the popularity of events via two-level mappings, i.e., self-excitation and mutual excitation. Self-excitation assumes that each keyword contributes to the popularity on its own, while mutual excitation models how two keywords excite each other to determine the popularity of an event. Specifically, we use a Graph Neural Network (GNN) as the backbone to combine self-excitation, mutual excitation, and the context of images into a sparse and deep factor model. In addition, we release a challenging web event dataset for the popularity prediction task. The experimental results on three public datasets demonstrate that our method achieves significant improvements and outperforms state-of-the-art methods. The dataset is publicly available at: https://github.com/pangjunbiao/Hot-events-dataset.
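The two-level mapping described above can be illustrated with a small additive score. This is a simplified sketch in the style of a factorization machine, not the GNN-based model of the paper. It assumes that self-excitation is a per-keyword weight, that mutual excitation is the inner product of two keyword embeddings along a graph edge, that image context enters as one extra additive term, and that sparsity is encouraged with an L1 penalty. All names (`popularity_score`, `l1_penalty`, `image_term`) and all numeric values are hypothetical.

```python
def popularity_score(keywords, base, self_w, emb, edges, image_term=0.0):
    """Additive popularity: base + self-excitation + mutual excitation + image context."""
    present = set(keywords)
    # Self-excitation: each keyword contributes on its own.
    self_part = sum(self_w.get(k, 0.0) for k in present)
    # Mutual excitation: only keyword pairs linked in the graph interact.
    mutual_part = 0.0
    for a, b in edges:
        if a in present and b in present:
            mutual_part += sum(x * y for x, y in zip(emb[a], emb[b]))
    return base + self_part + mutual_part + image_term


def l1_penalty(self_w, lam=0.01):
    """L1 regularizer that pushes most keyword weights toward zero (sparsity)."""
    return lam * sum(abs(w) for w in self_w.values())


# Hypothetical parameters for a three-keyword vocabulary.
base = 1.0
self_w = {"election": 0.5, "debate": 0.0, "poll": 0.25}
emb = {"election": [1.0, 0.0], "debate": [0.5, 2.0], "poll": [0.0, 1.0]}
edges = [("debate", "election")]

score_linked = popularity_score(["election", "debate"], base, self_w, emb, edges)
score_unlinked = popularity_score(["election", "poll"], base, self_w, emb, edges)
penalty = l1_penalty(self_w, lam=1.0)
```

Because every term is additive, the contribution of each keyword or keyword pair can be read off directly, which is the kind of interpretability the sparse factor model aims for.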
