Large Language Model as Universal Retriever in Industrial-Scale Recommender System
Junguang Jiang, Yanwen Huang, Bin Liu, Xiaoyu Kong, Xinhang Li, Ziru Xu, Han Zhu, Jian Xu, Bo Zheng
TL;DR
This work introduces the Universal Retrieval Model (URM), an LLM-based generative retrieval framework that unifies multiple retrieval objectives into a single input-output system. It enhances expressiveness through multi-query representations, improves learnability and transferability with a $W = UV^\top$ matrix decomposition, and reduces inference cost via probabilistic sampling and efficient ANN-based neighbor exploration. The approach demonstrates strong offline performance on public and industrial datasets and delivers online gains, including a 3.01% revenue increase, while maintaining tens-of-milliseconds latency. URM’s ability to adapt to various objectives and even unseen tasks underscores the potential of LLMs as universal retrievers in large-scale recommender systems. Practical deployment considerations and ablations indicate a robust, scalable pathway for industry adoption and future extension to broader objective spaces.
Abstract
In real-world recommender systems, different retrieval objectives are typically addressed using task-specific datasets with carefully designed model architectures. We demonstrate that Large Language Models (LLMs) can function as universal retrievers, capable of handling multiple objectives within a generative retrieval framework. To model complex user-item relationships within generative retrieval, we propose a multi-query representation. To address the challenge of extremely large candidate sets in industrial recommender systems, we introduce matrix decomposition to boost model learnability, discriminability, and transferability, and we incorporate probabilistic sampling to reduce computation costs. Finally, our Universal Retrieval Model (URM) can adaptively generate a set from tens of millions of candidates based on an arbitrary given objective while keeping the latency within tens of milliseconds. Applied to industrial-scale data, URM outperforms expert models elaborately designed for different retrieval objectives in offline experiments and significantly improves the core metric of an online advertising platform by $3\%$.
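
To make the matrix decomposition idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the decomposed matrix $W$ acts as an item-scoring matrix of shape (number of items) × (hidden size), factorized as $W = UV^\top$ with a small rank. The function names, the shapes, the rank, and the random data are all hypothetical and chosen only for illustration. The sketch shows that projecting the hidden state through $V^\top$ first gives the same scores as multiplying by the full $W$, while storing far fewer parameters.

```python
import random


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def factorized_scores(U, V, h):
    """Score every item as U (V^T h) without materializing W = U V^T.

    U: n_items x rank, V: hidden x rank, h: hidden-size vector.
    """
    q = matvec(transpose(V), h)  # project h down to the rank-sized space
    return matvec(U, q)          # one dot product per item in that space


def top_k(scores, k):
    """Indices of the k highest-scoring items."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def param_counts(n_items, hidden, rank):
    """Parameter counts for the full matrix versus the factorized form."""
    return n_items * hidden, n_items * rank + hidden * rank


# Hypothetical toy sizes; the candidate set in the paper is tens of millions.
rng = random.Random(0)
N_ITEMS, HIDDEN, RANK = 50, 16, 4
U = [[rng.gauss(0, 1) for _ in range(RANK)] for _ in range(N_ITEMS)]
V = [[rng.gauss(0, 1) for _ in range(RANK)] for _ in range(HIDDEN)]
h = [rng.gauss(0, 1) for _ in range(HIDDEN)]

W = matmul(U, transpose(V))          # full N_ITEMS x HIDDEN matrix, for comparison only
full_scores = matvec(W, h)
low_rank_scores = factorized_scores(U, V, h)
max_gap = max(abs(a - b) for a, b in zip(full_scores, low_rank_scores))
```

In this sketch the item-side factor `U` behaves like a table of item embeddings, so the final scoring step is an inner-product search over items. That is the kind of computation an approximate nearest neighbor index can serve, although how URM combines the decomposition with sampling and ANN search is not specified in this section.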
