WebRec: Enhancing LLM-based Recommendations with Attention-guided RAG from Web

Zihuai Zhao, Yujuan Ding, Wenqi Fan, Qing Li

TL;DR

WebRec tackles the knowledge-gap problem in LLM-based recommender systems in two stages: a training-free retrieval stage that brings in up-to-date web information, and a generation stage that handles noisy web content with an attention-guided MP-Head. The retrieval stage converts recommendation prompts into semantically rich web queries by scoring each generated token with $s_i = s_i^{\mathrm{attention}} \cdot s_i^{\mathrm{entropy}}$ and keeping the high-signal keywords, which are then used to query web sources. The generation stage introduces MP-Head, an extra attention head that performs one-hop message passing over an entity–relation graph built from KV tokens and a learnable task feature $z$. Its output is concatenated with the original attention output, $a_i^{\mathrm{MP}} = \mathrm{CONCAT}(\mathrm{head}_{\mathrm{MP}}(x_i), a_i)$, with message updates $m_i^{(l)}$, so that long-distance dependencies in noisy web content can be modeled robustly. Empirical results on four Amazon-domain datasets show WebRec achieving state-of-the-art HR and NDCG across multiple web sources, supporting the practicality and effectiveness of attention-guided web retrieval for LLM-based recommendations.
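The keyword scoring in the TL;DR, $s_i = s_i^{\mathrm{attention}} \cdot s_i^{\mathrm{entropy}}$, can be illustrated with a small Python sketch. This is not the authors' implementation: the summary does not say how each factor is computed, so the per-token attention and entropy values below are made-up inputs, and the names `token_entropy`, `select_keywords`, and `top_k` are hypothetical. The sketch only shows the product-then-select step.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one probability distribution.

    Assumption: one possible way to obtain an entropy score per token;
    the summary does not specify the actual definition.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_keywords(tokens, attention_scores, entropy_scores, top_k=3):
    """Score tokens with s_i = attention_i * entropy_i and keep the top_k.

    Returns the kept tokens in their original order, plus all scores.
    """
    if not (len(tokens) == len(attention_scores) == len(entropy_scores)):
        raise ValueError("tokens and score lists must have equal length")
    scores = [a * e for a, e in zip(attention_scores, entropy_scores)]
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:top_k])  # restore original token order
    return [tokens[i] for i in keep], scores


# Illustrative values only (not taken from the paper).
tokens = ["user", "likes", "wireless", "noise-cancelling", "headphones"]
attention = [0.05, 0.10, 0.25, 0.35, 0.25]
entropy = [0.2, 0.3, 1.1, 1.4, 0.9]

keywords, scores = select_keywords(tokens, attention, entropy, top_k=3)
query = " ".join(keywords)
print(query)
```

With these inputs the three highest products belong to the last three tokens, so the resulting query string is built from them. In a real pipeline the query would then be sent to a web search API, which is outside the scope of this sketch.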

Abstract

Recommender systems play a vital role in alleviating information overload and enriching users' online experience. In the era of large language models (LLMs), LLM-based recommender systems have emerged as a prevalent paradigm for advancing personalized recommendations. Recently, retrieval-augmented generation (RAG) has drawn growing interest to facilitate the recommendation capability of LLMs, incorporating useful information retrieved from external knowledge bases. However, as a rich source of up-to-date information, the web remains under-explored by existing RAG-based recommendations. In particular, unique challenges are posed from two perspectives: one is to generate effective queries for web retrieval, considering the inherent knowledge gap between web search and recommendations; another challenge lies in harnessing online websites that contain substantial noisy content. To tackle these limitations, we propose WebRec, a novel web-based RAG framework, which takes advantage of the reasoning capability of LLMs to interpret recommendation tasks into queries of user preferences that cater to web retrieval. Moreover, given noisy web-retrieved information, where relevant pieces of evidence are scattered far apart, an insightful MP-Head is designed to enhance LLM attentions between distant tokens of relevant information via message passing. Extensive experiments have been conducted to demonstrate the effectiveness of our proposed web-based RAG methods in recommendation scenarios.
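To make the idea of message passing between distant tokens more concrete, here is a toy Python sketch of a one-hop update followed by the concatenation $a_i^{\mathrm{MP}} = \mathrm{CONCAT}(\mathrm{head}_{\mathrm{MP}}(x_i), a_i)$. It is an illustration under stated assumptions, not the paper's MP-Head: the graph is built by thresholding dot products, messages are neighbour means shifted by a fixed task feature `z`, and all function names and values are hypothetical. In the actual method the task feature is learnable and the graph is built from KV tokens.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def one_hop_messages(features, z, threshold=0.5):
    """One round of message passing over a similarity graph.

    Assumption: nodes i and j are linked when dot(x_i, x_j) > threshold.
    Each node receives the mean of its neighbours' features plus z.
    """
    messages = []
    for i, x_i in enumerate(features):
        neighbours = [
            x_j for j, x_j in enumerate(features)
            if j != i and dot(x_i, x_j) > threshold
        ]
        if neighbours:
            mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        else:
            mean = [0.0] * len(x_i)  # isolated node: only z contributes
        messages.append([m + t for m, t in zip(mean, z)])
    return messages


def mp_head_concat(features, attention_out, z, threshold=0.5):
    """Concatenate each node's message with its original attention output."""
    messages = one_hop_messages(features, z, threshold)
    return [m + a for m, a in zip(messages, attention_out)]  # list concat


# Tokens 0 and 2 are similar but separated by an unrelated token 1.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
attention_out = [[0.1], [0.2], [0.3]]
z = [0.5, 0.5]

augmented = mp_head_concat(features, attention_out, z)
for row in augmented:
    print(row)
```

In this toy input, tokens 0 and 2 exchange features even though an unrelated token sits between them, while token 1 receives only the task feature. That mirrors, in a very simplified way, the goal of linking relevant evidence that is scattered far apart in noisy web text.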

Paper Structure

This paper contains 33 sections, 17 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Illustration of web-enhanced RAG for recommendations. In addition to domain-specific knowledge from recommender systems, online websites offer a distinct advantage by providing access to up-to-date data as external knowledge. This contributes to fulfilling the timely information needs in recommendations, such as the latest customer feedback online, to facilitate the understanding of user preferences.
  • Figure 2: Overview of the retrieval stage of WebRec. In step 1, we take advantage of the reasoning capability of LLMs to interpret recommendation tasks as semantic queries. In step 2, retrieval queries can be generated to address the information needs of LLMs by carefully scoring generated tokens, without fine-tuning or massive prompt engineering for retrieval tasks. In step 3, the web retrieval can be conducted via search APIs.
  • Figure 3: Overview of the generation stage of WebRec. On the left, we illustrate the pipeline continued from web retrieval, i.e., Steps 1-3 in Fig. 2. In the right block, the proposed framework of the LLM's transformer block with MP-Head is presented, where MP-Head takes the original attention features as inputs and models their long-distance dependencies via message passing. Notably, the learned MP-Head output can be seamlessly integrated into the attention output to facilitate LLM-based recommendations over noisy web information.
  • Figure 4: Ablation on Different Retrieval Strategies.
  • Figure 5: Ablation on Different Generation Strategies.
  • ...and 1 more figure