Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models

Yunjia Xi, Weiwen Liu, Jianghao Lin, Muyan Weng, Xiaoling Cai, Hong Zhu, Jieming Zhu, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang

TL;DR

This work tackles the open-world knowledge gap in industrial recommender systems by introducing REKI, a model-agnostic framework that extracts knowledge from large language models (LLMs) through factorization prompting and converts it into dense augmented vectors that a conventional recommendation model (CRM) can consume. It distinguishes two knowledge extraction modes, individual and collective, to handle small- and large-scale scenarios, and uses a hybridized expert-integrated network (HEIN) to map semantic knowledge into the recommendation space. By prestoring knowledge representations, REKI achieves online latency comparable to traditional CRMs while retaining LLM-derived insights, and it demonstrates significant gains on public datasets and in real-world Huawei deployments (e.g., 7% recall improvement in news and 1.99% uplift in music plays). The approach outperforms PLM-based baselines and KG-based methods, and its deployment shows that it is practical for industry-scale open-world recommendation. Overall, REKI provides a scalable, efficient way to fuse open-world knowledge with conventional recommender systems.

Abstract

Recommender systems (RSs) play a pervasive role in today's online services, yet their closed-loop nature constrains their access to open-world knowledge. Recently, large language models (LLMs) have shown promise in bridging this gap. However, previous attempts to use LLMs directly as recommenders fall short of the requirements of industrial RSs, particularly in terms of online inference latency and offline resource efficiency. Thus, we propose REKI to acquire two types of external knowledge about users and items from LLMs. Specifically, we introduce factorization prompting to elicit accurate knowledge reasoning on user preferences and items. We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption. Subsequently, the generated knowledge is efficiently transformed and condensed into augmented vectors through a hybridized expert-integrated network, ensuring compatibility. The obtained vectors can then be used to enhance any conventional recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from LLMs. Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with a wide range of recommendation algorithms and tasks. REKI has been deployed on Huawei's news and music recommendation platforms, where it gained improvements of 7% and 1.99%, respectively, in online A/B tests.

Paper Structure (39 sections, 4 equations, 5 figures, 8 tables)

Figures (5)

  • Figure 1: The overall framework of REKI consists of a knowledge extraction stage and a knowledge integration stage. The knowledge extraction stage leverages our designed factorization prompting, together with two knowledge extraction approaches, to extract knowledge from LLMs for users and items effectively. The knowledge integration stage converts the open-world knowledge into compact user and item representations suitable for recommendation. The user/item representations and user/item augmented vectors can be prestored for fast inference. Next, the user and item augmented vectors are integrated into an existing conventional recommendation model (CRM) as additional input features.
  • Figure 2: Example prompts for REKI. The green, purple, and yellow text bubbles represent the prompt template, the content to be filled in the template, and the real response generated by LLMs, respectively.
  • Figure 3: Comparison between knowledge from a knowledge graph and from an LLM on the MovieLens-1M dataset.
  • Figure 4: The system design for the deployment of REKI.
  • Figure 5: Ablation study of user and item knowledge on the Amazon-Books dataset.
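
The Figure 1 caption describes prestoring user and item augmented vectors and feeding them to a CRM as additional input features. A minimal Python sketch of that lookup-and-concatenate step might look like the following. Everything named here is hypothetical and not taken from the paper: the in-memory dictionaries `USER_AUG` and `ITEM_AUG`, the vector size `AUG_DIM`, the made-up vector values, and the zero-vector fallback for IDs with no prestored vector are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical prestored augmented vectors. In the described setup these
# would be computed offline from LLM-generated knowledge; the values and
# the dict-based store here are made up for illustration.
AUG_DIM = 2
USER_AUG: Dict[str, List[float]] = {"u1": [0.1, 0.2]}
ITEM_AUG: Dict[str, List[float]] = {"i1": [0.3, 0.4]}


def lookup(store: Dict[str, List[float]], key: str) -> List[float]:
    """Return the prestored vector for `key`.

    Falling back to a zero vector for unseen IDs is an assumption of
    this sketch, not something the source specifies.
    """
    return store.get(key, [0.0] * AUG_DIM)


def build_crm_input(base_features: List[float], user_id: str, item_id: str) -> List[float]:
    """Append user and item augmented vectors to the CRM's original features.

    No LLM is called here: online inference only performs two lookups
    and a concatenation, which is why latency stays close to the plain CRM.
    """
    return list(base_features) + lookup(USER_AUG, user_id) + lookup(ITEM_AUG, item_id)


if __name__ == "__main__":
    print(build_crm_input([1.0, 0.0, 5.0], "u1", "i1"))
```

Because the augmented vectors enter only as extra input features, the downstream model itself does not need to change, which is consistent with the model-agnostic framing in the abstract.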