Learning to Hash for Recommendation: A Survey
Fangyuan Luo, Yankai Chen, Jun Wu, Tong Li, Philip S. Yu, Xue Liu
TL;DR
This survey reviews learning to hash for recommendation (HashRec), outlining how two-tower recall models map users and items to binary codes to enable fast, large-scale retrieval. It introduces a three-tier taxonomy based on learning objectives, optimization strategies, and recommendation scenarios, and analyzes two hash search strategies: hash table lookup and hash code ranking. The paper surveys pointwise, pairwise, listwise, and heterogeneous HashRec methods, detailing forward, backward, and proximal optimization approaches and discussing applicability to cold-start, social, outfit, explainable, and federated settings. It also covers standard evaluation metrics and future directions, including general HashRec frameworks, efficiency-effectiveness trade-offs, lightweight LLM integration, multi-objective learning, and bias mitigation, and provides a public GitHub repository that indexes the surveyed methods.
Abstract
With the explosive growth of users and items, Recommender Systems are facing unprecedented challenges in terms of retrieval efficiency and storage overhead. Learning to Hash techniques have emerged as a promising solution to these issues by encoding high-dimensional data into compact hash codes. As a result, hashing-based recommendation methods (HashRec) have garnered growing attention for enabling large-scale and efficient recommendation services. This survey provides a comprehensive overview of state-of-the-art HashRec algorithms. Specifically, we begin by introducing the common two-tower architecture used in the recall stage and by detailing two predominant hash search strategies. Then, we categorize existing works into a three-tier taxonomy based on: (i) learning objectives, (ii) optimization strategies, and (iii) recommendation scenarios. Additionally, we summarize widely adopted evaluation metrics for assessing both the effectiveness and efficiency of HashRec algorithms. Finally, we discuss current limitations in the field and outline promising directions for future research. We index these HashRec methods at the repository https://github.com/Luo-Fangyuan/HashRec.
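As a rough illustration of the two hash search strategies named above, the following minimal sketch binarizes real-valued embeddings by sign and then retrieves items either by ranking all codes by Hamming distance or by probing hash table buckets within a small Hamming radius. It is not taken from the survey or its repository: the function names, the sign-based binarization, the 4-bit code length, and the toy embeddings are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def binarize(vec):
    """Pack sign(vec) into an int: bit i is 1 when vec[i] >= 0 (assumed scheme)."""
    code = 0
    for i, x in enumerate(vec):
        if x >= 0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(user_code, item_codes, k):
    """Hash code ranking: scan every item and keep the k nearest codes."""
    order = sorted(item_codes, key=lambda it: (hamming(user_code, item_codes[it]), it))
    return order[:k]


def build_table(item_codes):
    """Hash table: map each distinct code to the items stored under it."""
    table = defaultdict(list)
    for item, code in item_codes.items():
        table[code].append(item)
    return table


def lookup(user_code, table, n_bits, radius):
    """Hash table lookup: probe all buckets within the given Hamming radius."""
    hits = []
    for r in range(radius + 1):
        for flips in combinations(range(n_bits), r):
            probe = user_code
            for bit in flips:
                probe ^= 1 << bit
            hits.extend(table.get(probe, []))
    return hits


# Toy embeddings (hypothetical values, 4 dimensions -> 4-bit codes).
user_code = binarize([0.3, -1.2, 0.8, -0.1])
item_codes = {
    "a": binarize([0.5, -0.4, 0.9, -0.7]),
    "b": binarize([0.2, 0.6, 0.1, -0.3]),
    "c": binarize([-0.9, 0.4, -0.2, 0.5]),
    "d": binarize([0.1, -0.5, -0.6, -0.2]),
}

top3 = rank_by_hamming(user_code, item_codes, k=3)
near = lookup(user_code, build_table(item_codes), n_bits=4, radius=1)
print(top3, near)
```

In this sketch, ranking touches every item code once, while table lookup touches only the buckets reachable by flipping up to `radius` bits, so its cost grows with the number of probes rather than with the number of items.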
