Sentiment and Hashtag-aware Attentive Deep Neural Network for Multimodal Post Popularity Prediction

Shubhi Bansal, Mohit Kumar, Chandravardhan Singh Raghaw, Nagendra Kumar

TL;DR

This work tackles multimodal post popularity prediction by introducing NARRATOR, a model that jointly leverages textual and visual content, demographic cues inferred from faces, and signals derived from hashtags, including sentiment. A novel hashtag-guided attention mechanism uses hashtag context to direct attention across text and images, while visual demographics and hashtag sentiment are explicitly modeled to capture audience relevance and emotional resonance. The approach integrates BERTopic-derived topic embeddings, GraphSAGE structural representations of hashtags, and Stanford CoreNLP sentiment, fused with PCA-reduced social features through a deep CNN-DNN cascade. Empirical results on the TPIC and SMP datasets show substantial improvements over state-of-the-art baselines in MSE and MAE, with strong statistical correlations and ablation analyses validating the contributions of visual demographics and hashtag sentiment. The findings highlight the practical impact for content recommendation, trend analysis, and targeted engagement, while also outlining limitations and directions for future work on efficiency and cross-domain applicability.

Abstract

Social media users articulate their opinions on a broad spectrum of subjects and share their experiences through posts comprising multiple modes of expression, leading to a notable surge in such multimodal content on social media platforms. Nonetheless, accurately forecasting the popularity of these posts presents a considerable challenge. Prevailing methodologies primarily center on the content itself, thereby overlooking the wealth of information in other signals such as visual demographics and the sentiment conveyed through hashtags, and failing to adequately model the intricate relationships among hashtags, text, and accompanying images. This oversight limits the ability to capture emotional connection and audience relevance, both of which significantly influence post popularity. To address these limitations, we propose a seNtiment and hAshtag-aware attentive deep neuRal netwoRk for multimodAl posT pOpularity pRediction, herein referred to as NARRATOR, that extracts visual demographics from faces appearing in images and discerns sentiment from hashtag usage, providing a more comprehensive understanding of the factors influencing post popularity. Moreover, we introduce a hashtag-guided attention mechanism that leverages hashtags as navigational cues, guiding the model's focus toward the most pertinent features of the textual and visual modalities, thus aligning with target audience interests and the broader social media context. Experimental results demonstrate that NARRATOR outperforms existing methods by a significant margin on two real-world datasets. Furthermore, ablation studies underscore the efficacy of integrating visual demographics, sentiment analysis of hashtags, and hashtag-guided attention mechanisms in enhancing the performance of post popularity prediction, thereby facilitating increased audience relevance, emotional engagement, and aesthetic appeal.
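
To make the idea of hashtags acting as "navigational cues" concrete, the following is a minimal sketch of one common way such guidance can be realized: the hashtag representation serves as a query in scaled dot-product attention over a set of text or image feature vectors. This is an illustrative assumption, not the formulation used in NARRATOR, which the abstract does not specify. The function name `hashtag_guided_attention`, the toy vectors, and the choice of scaled dot-product scoring are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hashtag_guided_attention(hashtag_vec, features):
    """Hypothetical sketch: attend over `features` using the hashtag vector as query.

    hashtag_vec: list of d floats (assumed pooled hashtag embedding).
    features:    list of n vectors of d floats (e.g. token or image-region features).
    Returns the attention weights and the weighted context vector.
    """
    d = len(hashtag_vec)
    # Scaled dot-product similarity between the hashtag query and each feature.
    scores = [
        sum(h * f for h, f in zip(hashtag_vec, feat)) / math.sqrt(d)
        for feat in features
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the feature vectors.
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(d)
    ]
    return weights, context


# Toy example: the first feature is most aligned with the hashtag vector.
hashtag_vec = [1.0, 0.0]
features = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, context = hashtag_guided_attention(hashtag_vec, features)
```

In this toy setup, features that point in the same direction as the hashtag vector receive larger weights, so the resulting context vector emphasizes the content most related to the hashtags. The same routine could be applied separately to textual and visual features, again under the assumption that all three share an embedding dimension.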

Paper Structure

This paper contains 44 sections, 21 equations, 10 figures, 7 tables, 1 algorithm.

Figures (10)

  • Figure 1: Example Social Media Post
  • Figure 2: System Architecture of NARRATOR
  • Figure 3: Posts depicting Demographic Features
  • Figure 4: Feature Fusion
  • Figure 5: Deep Feedforward Neural Network for Popularity Score Prediction
  • ...and 5 more figures