A Comprehensive Review on Hashtag Recommendation: From Traditional to Deep Learning and Beyond
Shubhi Bansal, Kushaan Gowda, Anupama Sureshbabu K, Chirag Kothari, Nagendra Kumar
TL;DR
Hashtag recommendation has evolved from simple frequency-based methods to sophisticated multimodal and transformer-powered approaches. The paper provides a hierarchical taxonomy across modalities, problem formulations, filtering, methods, datasets, evaluation, and applications, drawing on nearly 150 studies from 2015 to 2024. It highlights a paradigm shift to transformer-based and graph-based models, augmented by retrieval and external knowledge, to address data sparsity, long-tail distributions, and rapid trend dynamics. Practical implications span content discovery, engagement, and downstream tasks such as sentiment analysis and misinformation detection, while future directions emphasize scalability, explainability, and cross-platform generalization.
Abstract
The exponential growth of user-generated content on social media platforms has precipitated significant challenges in information management, particularly in content organization, retrieval, and discovery. Hashtags, as a fundamental categorization mechanism, play a pivotal role in enhancing content visibility and user engagement. However, the development of accurate and robust hashtag recommendation systems remains a complex and evolving research challenge. Existing surveys in this domain are limited in scope and recency, focusing narrowly on specific platforms, methodologies, or timeframes. To address this gap, this review article conducts a systematic analysis of hashtag recommendation systems, comprehensively examining recent advancements across several dimensions. We investigate unimodal versus multimodal methodologies, diverse problem formulations, filtering strategies, and the methodological evolution from traditional frequency-based models to advanced deep learning architectures. Furthermore, we critically evaluate performance assessment paradigms, including quantitative metrics, qualitative analyses, and hybrid evaluation frameworks. Our analysis underscores a paradigm shift toward transformer-based deep learning models, which harness contextual and semantic features to achieve superior recommendation accuracy. Key challenges such as data sparsity, cold-start scenarios, polysemy, and model explainability are rigorously discussed, alongside practical applications in tweet classification, sentiment analysis, and content popularity prediction. By synthesizing insights from diverse methodological and platform-specific perspectives, this survey provides a structured taxonomy of current research, identifies unresolved gaps, and proposes future directions for developing adaptive, user-centric recommendation systems.
