On Your Mark, Get Set, Predict! Modeling Continuous-Time Dynamics of Cascades for Information Popularity Prediction

Xin Jing; Yichen Jing; Yuhuan Lu; Bangchao Deng; Sikun Yang; Dingqi Yang

TL;DR

The paper proposes ConCat, which models the Continuous-time dynamics of Cascades for information popularity prediction. It leverages neural Ordinary Differential Equations (ODEs) to model the irregular events of a cascade in continuous time, based on the cascade graph and sequential event information.

Abstract

Information popularity prediction is important yet challenging in various domains, including viral marketing and news recommendations. The key to accurately predicting information popularity lies in subtly modeling the underlying temporal information diffusion process behind observed events of an information cascade, such as the retweets of a tweet. To this end, most existing methods either adopt recurrent networks to capture the temporal dynamics from the first to the last observed event or develop a statistical model based on self-exciting point processes to make predictions. However, information diffusion is intrinsically a complex continuous-time process with irregularly observed discrete events, which is oversimplified using recurrent networks as they fail to capture the irregular time intervals between events, or using self-exciting point processes as they lack flexibility to capture the complex diffusion process. Against this background, we propose ConCat, modeling the Continuous-time dynamics of Cascades for information popularity prediction. On the one hand, it leverages neural Ordinary Differential Equations (ODEs) to model irregular events of a cascade in continuous time based on the cascade graph and sequential event information. On the other hand, it considers cascade events as neural temporal point processes (TPPs) parameterized by a conditional intensity function which can also benefit the popularity prediction task. We conduct extensive experiments to evaluate ConCat on three real-world datasets. Results show that ConCat achieves superior performance compared to state-of-the-art baselines, yielding a 2.3%-33.2% improvement over the best-performing baselines across the three datasets.
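To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a hidden state that flows under an ODE between events, is updated by a jump at each event, and drives a conditional intensity whose integral over the observation window is accumulated alongside it. The function names (`evolve_with_jumps`, `_flow`), the random weights, the fixed-step Euler solver, the tanh jump (a simple stand-in for a GRU-style update), and the softplus intensity are all illustrative assumptions.

```python
import numpy as np


def _flow(h, t0, t1, drift, intensity, n_steps):
    """Euler-integrate dh/dt = drift(h) on [t0, t1]; trapezoid-integrate the intensity."""
    dt = (t1 - t0) / n_steps
    area = 0.0
    for _ in range(n_steps):
        lam_before = intensity(h)
        h = h + dt * drift(h)
        area += 0.5 * dt * (lam_before + intensity(h))
    return h, area


def evolve_with_jumps(event_times, event_inputs, t_obs,
                      hidden_dim=8, n_steps=20, seed=0):
    """Return (final hidden state, integrated intensity, TPP log-likelihood).

    event_times  : sorted 1-D array of event timestamps in [0, t_obs]
    event_inputs : 2-D array with one feature vector per event
    """
    rng = np.random.default_rng(seed)
    in_dim = event_inputs.shape[1]
    # Hypothetical, randomly initialised parameters; a real model would learn them.
    W_f = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # ODE drift
    W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # jump, hidden part
    W_s = rng.normal(scale=0.1, size=(hidden_dim, in_dim))      # jump, event part
    w_lam = rng.normal(scale=0.1, size=hidden_dim)              # intensity head

    def drift(h):
        return np.tanh(W_f @ h)

    def intensity(h):
        return np.log1p(np.exp(w_lam @ h))  # softplus keeps the intensity positive

    h, t = np.zeros(hidden_dim), 0.0
    Lambda, log_lik = 0.0, 0.0
    for t_i, s_i in zip(event_times, event_inputs):
        # Continuous evolution over the (irregular) gap since the previous event.
        h, area = _flow(h, t, t_i, drift, intensity, n_steps)
        Lambda += area
        log_lik += np.log(intensity(h))  # intensity just before the jump
        # Discrete jump: inject the event's features into the hidden state.
        h = np.tanh(W_h @ h + W_s @ s_i)
        t = t_i
    # Keep evolving from the last event up to the observation time.
    h, area = _flow(h, t, t_obs, drift, intensity, n_steps)
    Lambda += area
    return h, Lambda, log_lik - Lambda


times = np.array([0.1, 0.15, 0.9, 2.4])   # irregularly spaced toy events
inputs = np.ones((4, 3))                  # toy per-event features
h_final, Lambda, log_lik = evolve_with_jumps(times, inputs, t_obs=3.0)
```

In this sketch the gap between the last event and the observation time is handled by the same ODE flow as the gaps between events. A plain recurrent update has no counterpart for that step, because it only changes state when an event arrives. A downstream predictor could take `h_final` and `Lambda` as inputs, which is in the spirit of the pipeline described in the paper's overview figure.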

Paper Structure (25 sections, 22 equations, 5 figures, 9 tables)

Introduction
Preliminaries
Problem Definition
Neural Ordinary Differential Equations
Neural Temporal Point Processes
ConCat
Structural Learning
Cascade Graph Learning
Global Graph Learning
Temporal Dynamics Modeling
Modeling Sequential Information with Self-Attention
Modeling Continuous Dynamics with Neural ODEs
Global Trend Modeling
Prediction
Experiments
...and 10 more sections

Figures (5)

Figure 1: A toy example of a retweet cascade graph (top) and the distribution of the irregular time interval between the last node and the observation time, which is 1 hour for the Weibo dataset and 3 years for the APS dataset (bottom)
Figure 2: An overview of our proposed model ConCat: (1) The input is a cascade graph $\mathbf{G}$ and the global graph $\mathcal{G}$ for a given observation time; we use GraphWave and NetSMF to model them respectively, obtaining the representations $E_c(u_i)$ and $E_g(u_i)$ of each node. (2) We use two self-attention modules to extract the sequential information of the cascade graph and the global graph, concatenate the results into new vectors $\left\{s_{0},s_{1},...\right\}$, and leverage neural ODEs to model the continuous-time dynamics, taking $\left\{s_{0},s_{1},...\right\}$ as the jump condition in a GRU. (3) We use the hidden state $h_{t_i}$ to parameterize the TPP and compute the integral $\Lambda_{t_{s}}$. (4) We feed the final hidden state $h_{t_s}$ and $\Lambda_{t_{s}}$ into an MLP for prediction.
Figure 3: CDF and PDF of the number of observed triplets (left) and the distribution of popularity (right)
Figure 4: Impact of the number of triplets. We set the number of triplets to 100, 200, 400, 800, and 1000 on three datasets, and we compare ConCat with the top two baselines CasFlow and CTCP.
Figure 5: Impact of the hidden dimension of ${h}_{t}$ for the first 100 triplets.
