Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models
Yunjia Xi, Weiwen Liu, Jianghao Lin, Muyan Weng, Xiaoling Cai, Hong Zhu, Jieming Zhu, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang
TL;DR
This work tackles the open-world knowledge gap in industrial recommender systems by introducing REKI, a model-agnostic framework that elicits knowledge from large language models (LLMs) through factorization prompting and converts it into dense augmented vectors that conventional recommendation models (CRMs) can consume. It distinguishes two knowledge extraction modes, individual and collective, to handle small- and large-scale scenarios respectively, and uses a hybridized expert-integrated network (HEIN) to map semantic knowledge into the recommendation space. By prestoring knowledge representations, REKI achieves online latency comparable to traditional CRMs while retaining LLM-derived insights. It shows significant gains on public datasets and in real-world Huawei deployments (e.g., a 7% recall improvement in news and a 1.99% uplift in music plays), and it outperforms both PLM-based and KG-based baselines. Overall, REKI offers a scalable, efficient way to fuse open-world knowledge with conventional recommender systems at industrial scale.
Abstract
Recommender systems (RSs) play a pervasive role in today's online services, yet their closed-loop nature constrains their access to open-world knowledge. Recently, large language models (LLMs) have shown promise in bridging this gap. However, previous attempts to use LLMs directly as recommenders fall short of the requirements of industrial RSs, particularly in terms of online inference latency and offline resource efficiency. We therefore propose REKI to acquire two types of external knowledge about users and items from LLMs. Specifically, we introduce factorization prompting to elicit accurate knowledge reasoning on user preferences and items. We develop individual knowledge extraction and collective knowledge extraction, tailored to scenarios of different scales, which effectively reduces offline resource consumption. The generated knowledge is then efficiently transformed and condensed into augmented vectors by a hybridized expert-integrated network, making it compatible with conventional recommendation models. The resulting vectors can be used to enhance any conventional recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from LLMs. Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with a wide range of recommendation algorithms and tasks. REKI has been deployed on Huawei's news and music recommendation platforms, where it achieved improvements of 7% and 1.99%, respectively, in online A/B tests.
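The prestore-then-look-up idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: `toy_text_encoder` stands in for a real text encoder, `ToyExpertMixer` is a generic softmax-gated mixture of linear experts (not the actual hybridized expert-integrated network), and the weights are random rather than trained. Whether the stored vector is the encoder output or the adapter output is also an assumption here.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def make_matrix(rows, cols, rng):
    """Random weights; a real system would learn these during training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyExpertMixer:
    """Hypothetical gated mixture of linear experts (illustrative only)."""

    def __init__(self, in_dim, out_dim, n_experts, seed=0):
        rng = random.Random(seed)
        self.experts = [make_matrix(out_dim, in_dim, rng) for _ in range(n_experts)]
        self.gate = make_matrix(n_experts, in_dim, rng)

    def __call__(self, vec):
        gate_weights = softmax(linear(self.gate, vec))
        outputs = [linear(expert, vec) for expert in self.experts]
        out_dim = len(outputs[0])
        # Gate-weighted sum of the expert outputs, one dimension at a time.
        return [
            sum(g * out[i] for g, out in zip(gate_weights, outputs))
            for i in range(out_dim)
        ]


def toy_text_encoder(text, dim=8):
    """Placeholder encoder: buckets tokens into a fixed-size count vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def build_knowledge_store(knowledge_texts, mixer, dim=8):
    """Offline step: encode LLM-generated text once and store dense vectors."""
    return {
        key: mixer(toy_text_encoder(text, dim))
        for key, text in knowledge_texts.items()
    }


def augment_features(base_features, key, store):
    """Online step: a dictionary lookup and concatenation, with no LLM call."""
    return base_features + store[key]


# Usage: the knowledge texts below are made-up examples of LLM output.
mixer = ToyExpertMixer(in_dim=8, out_dim=4, n_experts=3)
store = build_knowledge_store(
    {"user_1": "enjoys science fiction thrillers", "item_9": "a space opera film"},
    mixer,
)
features = augment_features([0.5, 1.0], "user_1", store)
```

In this sketch the expensive work (text generation and encoding) happens once offline, and the online path only reads a precomputed vector, which is the property that keeps serving latency close to that of the underlying recommendation model.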
