SMTPD: A New Benchmark for Temporal Prediction of Social Media Popularity
Yijie Xu, Bolun Zheng, Wei Zhu, Hangjia Pan, Yuchen Yao, Ning Xu, Anan Liu, Quan Zhang, Chenggang Yan
TL;DR
SMTPD introduces a large-scale, multilingual, multi-modal benchmark for temporal popularity prediction on YouTube with aligned 30-day popularity sequences. The authors propose a baseline framework that combines visual (ResNet-101), textual (BERT-Multilingual), numerical, and categorical features, fused into a temporal regression model based on an LSTM and trained with a Composite Gradient Loss. Experimental results show that temporal alignment and early popularity signals substantially improve prediction accuracy across languages, outperforming SMPD baselines in temporal forecasting. The dataset and baseline enable cross-language, time-aligned analysis of social media popularity and support development of more effective prediction models for content optimization and digital marketing.
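The summary above names a Composite Gradient Loss but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: a pointwise error on the predicted popularity sequence plus an error on its day-to-day differences, so that a prediction is penalized for getting the shape of the curve wrong as well as its level. The function names, the use of mean squared error, and the `grad_weight` parameter are all hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def first_differences(seq: Sequence[float]) -> list:
    """Day-to-day changes of a popularity sequence (a discrete gradient)."""
    return [b - a for a, b in zip(seq, seq[1:])]


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Plain MSE between two equal-length sequences; 0.0 for empty input."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        return 0.0
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def composite_gradient_style_loss(
    pred: Sequence[float],
    target: Sequence[float],
    grad_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Assumed form: value term + grad_weight * first-difference term."""
    value_term = mean_squared_error(pred, target)
    grad_term = mean_squared_error(
        first_differences(pred), first_differences(target)
    )
    return value_term + grad_weight * grad_term


if __name__ == "__main__":
    target = [1.0, 2.0, 3.0, 4.0]        # toy 4-day popularity curve
    pred_offset = [2.0, 3.0, 4.0, 5.0]   # right shape, wrong level
    pred_flat = [2.5, 2.5, 2.5, 2.5]     # right average, wrong shape
    print(composite_gradient_style_loss(pred_offset, target))
    print(composite_gradient_style_loss(pred_flat, target))
```

Under this assumed form, a prediction with a constant offset incurs only the value term, while a flat prediction is additionally penalized for missing the upward trend. The actual loss used by the authors may combine different terms or weights.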
Abstract
The social media popularity prediction task aims to predict the popularity of posts on social media platforms, which benefits application scenarios such as content optimization, digital marketing, and online advertising. Although many studies have made significant progress, few pay much attention to integrating popularity prediction with temporal alignment. In this paper, by exploring YouTube's multilingual and multi-modal content, we construct a new social media temporal popularity prediction benchmark, namely SMTPD, and propose a baseline framework for temporal popularity prediction. Through data analysis and experiments, we verify that temporal alignment and early popularity play crucial roles in social media popularity prediction, which not only deepens the understanding of the temporal dynamics of popularity in social media but also suggests directions for developing more effective prediction models in this field. Code is available at https://github.com/zhuwei321/SMTPD.
