What Drives Online Popularity: Author, Content or Sharers? Estimating Spread Dynamics with Bayesian Mixture Hawkes

Pio Calderon; Marian-Andrei Rizoiu

TL;DR

The paper addresses the prediction of online content diffusion by jointly modeling source, content, and cascade factors. It introduces the Bayesian Mixture Hawkes (BMH) model, a two-level hierarchical mixture of separable Hawkes processes with a popularity submodel (BMH-P) and a kernel submodel (BMH-K) that link cascade- and item-level features to diffusion dynamics. On two Twitter retweet datasets, one covering controversial publishers and one covering reputable publishers, BMH outperforms state-of-the-art baselines in cold-start popularity prediction and temporal profile generalization, while enabling counterfactual analysis of headline styles. The approach reveals publisher-specific responses to headline styles (e.g., clickbait and inflammatory content) and highlights the distinct roles of initiating users in controversial versus reputable outlets. The framework offers a feature-aware, probabilistic tool for understanding and forecasting diffusion, with practical implications for content strategy and misinformation studies.

Abstract

The spread of content on social media is shaped by intertwining factors on three levels: the source, the content itself, and the pathways of content spread. At the lowest level, the popularity of the sharing user determines the content's eventual reach. However, higher-level factors such as the nature of the online item and the credibility of its source also play crucial roles in determining how widely and rapidly the online item spreads. In this work, we propose the Bayesian Mixture Hawkes (BMH) model to jointly learn the influence of source, content and spread. We formulate the BMH model as a hierarchical mixture model of separable Hawkes processes, accommodating different classes of Hawkes dynamics and the influence of feature sets on these classes. We test the BMH model on two learning tasks, cold-start popularity prediction and temporal profile generalization performance, applying it to two real-world retweet cascade datasets referencing articles from controversial and traditional media publishers. The BMH model outperforms the state-of-the-art models and predictive baselines on both datasets and utilizes cascade- and item-level information better than the alternatives. Lastly, we perform a counterfactual analysis where we apply the trained publisher-level BMH models to a set of article headlines and show that the effectiveness of headline writing style (neutral, clickbait, inflammatory) varies across publishers. The BMH model unveils differences in style effectiveness between controversial and reputable publishers: we find clickbait to be notably more effective for reputable publishers than for controversial ones, which links to the latter's overuse of clickbait.
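To make the "classes of Hawkes dynamics" concrete, the following is a minimal sketch, not the authors' implementation. It relies on a standard property of subcritical Hawkes processes: a single initial event with branching factor alpha < 1 produces an expected cascade of 1 / (1 - alpha) events, counting the initial event. The function names and the assumption that a cascade falls in class k with probability `weights[k]` are illustrative only.

```python
import math


def expected_cascade_size(alpha: float) -> float:
    """Expected total events from one initial event, for branching factor alpha < 1."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("branching factor must lie in [0, 1)")
    return 1.0 / (1.0 - alpha)


def mixture_expected_size(weights, alphas):
    """Expected cascade size under a mixture of Hawkes classes.

    Assumption (hypothetical): a cascade belongs to class k with
    probability weights[k], and class k has branching factor alphas[k].
    """
    if len(weights) != len(alphas):
        raise ValueError("weights and alphas must have the same length")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("mixture weights must sum to 1")
    return sum(w * expected_cascade_size(a) for w, a in zip(weights, alphas))


if __name__ == "__main__":
    # Two illustrative classes: one that never spreads, one moderately viral.
    print(mixture_expected_size([0.5, 0.5], [0.0, 0.5]))
```

In this sketch the mixture weights are fixed inputs; in the BMH model described above they would instead depend on cascade- and item-level features.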

Paper Structure (22 sections, 10 equations, 5 figures, 3 tables)

Introduction
Related Work.
Preliminaries
Hawkes Process
Dual Mixture Model
Bayesian Mixture Hawkes (BMH) Model
BMH-P, the Popularity Submodel
Inference and Prediction.
BMH-K, the Kernel Submodel
Inference and Prediction.
Predictive Evaluation
Datasets
Cold-Start Popularity Prediction
Results.
Temporal Profile Generalization Performance
...and 7 more sections

Figures (5)

Figure 1: An intuitive plate diagram for the BMH model. Left: The BMH model is trained using a historical dataset: a collection of $M$ publishers $\{\rho_1, \ldots, \rho_M\}$, items for each publisher (i.e. articles), and a set of diffusion cascades for each item. Each diffusion cascade consists of a timeline of events, here represented by a set of lollipops. Upper Right: The BMH is a publisher-level model that maps cascade features (shown in blue color) and article features (in red color) to a mixture of Hawkes processes. Lower Right: The trained BMH model (with the historical follower count distribution) can be used to infer spread dynamics of future articles based on their headlines.
Figure 2: Plate diagram of the BMH-P model. Shaded nodes are observables while empty nodes are latent variables. Paired colored edges indicate source nodes appearing as a product in the target node. For instance, the green edges indicate that $\overrightarrow{\gamma}_{\alpha,k}$ and $\overrightarrow{y}^a$ appear as the product $\overrightarrow{\gamma}_{\alpha,k} \cdot \overrightarrow{y}^a$ in the logit expression for $\alpha^{ac}$. The same concept holds for the blue and red edges. Edges marked with * indicate dependence of the target node on the source node indexed with $k$ and the entire set $\{1, \cdots, K_{\alpha}\}$. For instance, in its defining equation, $z^{ac}_{\alpha,k}$ depends on $\overrightarrow{\beta}^a_{z_{\alpha, k}}$ (see the numerator) and $\overrightarrow{\beta}^a_{z_{\alpha, k'}}$ for $k' \in \{1, \cdots, K_{\alpha}\}$ (see the denominator).
Figure 3: Plate diagram of the BMH-K model. Shaded nodes are observables while empty nodes are latent variables. Paired colored edges indicate source nodes appearing as a product in the target node. For instance, the green edges indicate that $\overrightarrow{\gamma}_{\theta,k}$ and $\overrightarrow{y}^a$ appear as the product $\overrightarrow{\gamma}_{\theta,k} \cdot \overrightarrow{y}^a$ in the logit expression for $\theta^{ac}$. The same concept holds for the blue and red edges. Edges marked with * indicate dependence of the target node on the source node indexed with $k$ and the entire set $\{1, \cdots, K_{\boldsymbol{\Theta}}\}$. For instance, in its defining equation, $z^{ac}_{\boldsymbol{\Theta},k}$ depends on $\overrightarrow{\beta}^a_{z_{\boldsymbol{\Theta}, k}}$ (see the numerator) and $\overrightarrow{\beta}^a_{z_{\boldsymbol{\Theta}, k'}}$ for $k' \in \{1, \cdots, K_{\boldsymbol{\Theta}}\}$ (see the denominator).
Figure 4: Predictive performance for (a) CNIX and (b) RNIX. The dots indicate the median and the error bars give the $25^{th}$ and $75^{th}$ percentiles. We compare the BMH with the DMM [Kong2020], EB [Tan2021], cascade-size (CR) models, and the joint HP.
Figure 5: (a) Distribution of predicted half-life $\log \hat{\tau}_{1/2}^a$ vs. cascade size $\log \hat{N}^{a}$ for each article in HEADLINES using the news.com.au BMH model. (b and c) Probability that an article performs better than the publisher average, for each headline style across CNIX and RNIX: (b) cascade size $\hat{N}^{a}$; (c) half-life $\hat{\tau}_{1/2}^a$.
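The captions of Figures 2 and 3 describe mixture weights $z^{ac}_{\alpha,k}$ whose numerator involves component $k$ and whose denominator runs over all components, and a logit-style expression for $\alpha^{ac}$ containing the product $\overrightarrow{\gamma}_{\alpha,k} \cdot \overrightarrow{y}^a$. The sketch below illustrates that general pattern under loudly stated assumptions: a softmax over linear scores for the weights, and a sigmoid link with a hypothetical intercept `delta_k` for the branching factor. The exact equations are not reproduced on this page, so the function names, arguments, and functional forms here are illustrative, not the authors' definitions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax: exp(s_k) / sum over k' of exp(s_k')."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_weights(betas, x):
    """Assumed form: weight of component k is softmax of beta_k . x.

    betas: one coefficient vector per mixture component (hypothetical).
    x: a feature vector for the cascade (hypothetical).
    """
    return softmax([dot(b, x) for b in betas])


def component_alpha(delta_k, gamma_k, y_article):
    """Assumed form: alpha = sigmoid(delta_k + gamma_k . y), so alpha lies in (0, 1).

    delta_k is a hypothetical intercept; gamma_k . y mirrors the product
    named in the Figure 2 caption.
    """
    t = delta_k + dot(gamma_k, y_article)
    return 1.0 / (1.0 + math.exp(-t))


if __name__ == "__main__":
    z = mixture_weights([[0.2, -0.1], [0.0, 0.3]], [1.0, 2.0])
    print(z, sum(z))
    print(component_alpha(-1.0, [0.5], [1.0]))
```

The softmax keeps the weights positive and summing to one, and the sigmoid keeps the branching factor in the subcritical range, which is why these two links are a natural reading of the captions.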
