A Hybrid Filtering for Micro-video Hashtag Recommendation using Graph-based Deep Neural Network

Shubhi Bansal, Kushaan Gowda, Mohammad Zia Ur Rehman, Chandravardhan Singh Raghaw, Nagendra Kumar

TL;DR

This work tackles micro-video hashtag recommendation by proposing MISHON, a hybrid filtering framework that fuses content, personalized signals, and collaborative filtering within a four-type interaction graph. The method extracts modality-specific features from visual, acoustic, and textual streams, refines them through GraphSAGE, and learns joint micro-video and user representations to predict hashtags, while also addressing cold-start users via social influence. Empirical results on TMALL, INSVIDEO, and YFCC show that MISHON consistently outperforms state-of-the-art baselines in Hit Rate, Precision, Recall, and F1, with notable gains on cold-start scenarios. The approach demonstrates the practical impact of integrating multimodal content and user interdependencies for scalable, personalized hashtag recommendations across diverse platforms.

Abstract

Due to the growing volume of user-generated content, hashtags are employed as topic indicators to manage content efficiently on social media platforms. However, finding these vital topics is challenging in micro-videos, since they pack substantial information into a short duration. Existing methods that recommend hashtags for micro-videos primarily focus on content and personalization while disregarding relatedness among users. Moreover, the cold-start user issue prevails in hashtag recommendation systems. Considering the above, we propose a hybrid-filtering-based MIcro-video haSHtag recommendatiON (MISHON) technique to recommend hashtags for micro-videos. Besides content-based filtering, we employ user-based collaborative filtering to enhance recommendations. Since hashtags reflect users' topical interests, we find similar users based on historical tagging behavior to model user relatedness. We employ a graph-based deep neural network to model user-to-user, modality-to-modality, and user-to-modality interactions. We then use refined modality-specific and user representations to recommend pertinent hashtags for micro-videos. The empirical results on three real-world datasets demonstrate that MISHON attains comparative improvements of 3.6%, 2.8%, and 6.5% in F1 score, respectively. Since there are cold-start users whose historical tagging information is unavailable, we also propose a content- and social-influence-based technique to model the relatedness of cold-start users with influential users. The proposed solution shows a relative improvement of 15.8% in F1 score over its content-only counterpart. These results show that the proposed framework mitigates the cold-start user problem.
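To make the idea of finding similar users from historical tagging behavior concrete, here is a minimal sketch. It assumes Jaccard overlap between users' past hashtag sets as the similarity measure; the abstract does not state which measure MISHON uses, and the function names (`jaccard`, `similar_users`) and the toy histories are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two hashtag sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_users(
    target: str, history: Dict[str, Set[str]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k users whose past hashtags overlap most with the target's.

    Assumption: similarity is Jaccard overlap of tagging histories. This is
    an illustrative stand-in, not the measure confirmed by the paper.
    """
    scores = [
        (user, jaccard(history[target], tags))
        for user, tags in history.items()
        if user != target
    ]
    # Sort by descending similarity, breaking ties by user id for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


# Toy, made-up tagging histories (not drawn from any dataset in the paper).
history = {
    "u1": {"#travel", "#food", "#sunset"},
    "u2": {"#travel", "#food", "#beach"},
    "u3": {"#gaming", "#esports"},
    "u4": {"#food", "#recipe"},
}

neighbours = similar_users("u1", history, k=2)
print(neighbours)
```

In a graph-based model, neighbours found this way could serve as user-to-user edges. A cold-start user has an empty history, so every similarity is zero, which is why the abstract turns to content and social influence for such users.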

Paper Structure

This paper contains 32 sections, 17 equations, 4 figures, 6 tables, 2 algorithms.

Figures (4)

  • Figure 1: Example of a Micro-video from Vine with Corresponding Hashtags
  • Figure 2: Visual Representation of Problem Definition
  • Figure 3: Overall Architecture of MISHON
  • Figure 5: Example Post