A Comprehensive Survey on Retrieval Methods in Recommender Systems

Junjie Huang, Jizheng Chen, Jianghao Lin, Jiarui Qin, Ziming Feng, Weinan Zhang, Yong Yu

TL;DR

This survey addresses the retrieval stage of cascade recommender systems, which has received less attention than ranking. It organizes existing methods into three areas: similarity learning (shallow and deep), indexing, and optimization. It also reports benchmarks on three public datasets and an industrial case study. Its main contributions are a taxonomy of retrieval methods, cross-method benchmarking, and a case study of industrial practice with practical insights and open challenges. The survey aims to guide researchers and practitioners in designing efficient, scalable retrieval components, which substantially affect overall recommendation quality and user experience.

Abstract

In an era dominated by information overload, effective recommender systems are essential for managing the deluge of data across digital platforms. Multi-stage cascade ranking systems are widely used in the industry, with retrieval and ranking being two typical stages. Retrieval methods sift through vast candidates to filter out irrelevant items, while ranking methods prioritize these candidates to present the most relevant items to users. Unlike studies focusing on the ranking stage, this survey explores the critical yet often overlooked retrieval stage of recommender systems. To achieve precise and efficient personalized retrieval, we summarize existing work in three key areas: improving similarity computation between user and item, enhancing indexing mechanisms for efficient retrieval, and optimizing training methods of retrieval. We also provide a comprehensive set of benchmarking experiments on three public datasets. Furthermore, we highlight current industrial applications through a case study on retrieval practices at a specific company, covering the entire retrieval process and online serving, along with practical implications and challenges. By detailing the retrieval stage, which is fundamental for effective recommendation, this survey aims to bridge the existing knowledge gap and serve as a cornerstone for researchers interested in optimizing this critical component of cascade recommender systems.

Paper Structure (56 sections, 28 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: The multi-stage architecture in modern recommender systems and the illustration of multi-channel retrieval. The latter will be detailed further in Section \ref{sec:mc}.
  • Figure 2: Distributions of the reviewed papers of retrieval methods in recommender systems over years (a) and over venues (b).
  • Figure 3: Taxonomy for retrieval methods in recommender systems.
  • Figure 4: An illustration of different retrieval strategies and input data for retrieval. Figure (a) illustrates non-personalized retrieval, a strategy that offers the same recommendation list to different users, such as 'trending hot', without tailoring to individual user interests. Figure (b) depicts personalized retrieval, which includes three retrieval strategies: U2I, U2U2I, and U2I2I (I2I). Figure (c) shows the user-item rating matrix, which is the core data or information used in retrieval methods. Figure (d) presents information beyond the user-item rating matrix, commonly involving side information such as user profiles, item attributes, and context information.
  • Figure 5: The architecture of deep retrieval methods focused on similarity learning.
  • ...and 3 more figures