Enhancing Recommender Systems with Large Language Model Reasoning Graphs

Yan Wang, Zhixuan Chu, Xin Ouyang, Simeng Wang, Hongyan Hao, Yue Shen, Jinjie Gu, Siqiao Xue, James Y Zhang, Qing Cui, Longfei Li, Jun Zhou, Sheng Li

TL;DR

The paper tackles interpretability and deep semantic understanding in recommender systems by introducing LLMRG, which uses large language models to build personalized reasoning graphs that connect user profiles and behavior through causal and logical inferences. The architecture comprises four modules (chained graph reasoning, divergent extension, self-verification and scoring, and knowledge base self-improvement) whose outputs are encoded via SR-GNN and fused with a base sequential recommender. Empirical results on ML-1M, Amazon Beauty, and Amazon Clothing show consistent performance gains, with GPT-4-based LLMRG outperforming its GPT-3.5-based counterpart and the ablations confirming the value of each module. The work highlights the practical potential of interpretable, reasoning-driven recommendations that leverage LLMs without requiring additional user/item data, while also noting the cost and reuse considerations of LLM access.

Abstract

Recommendation systems aim to provide users with relevant suggestions, but often lack interpretability and fail to capture higher-level semantic relationships between user behaviors and profiles. In this paper, we propose a novel approach that leverages large language models (LLMs) to construct personalized reasoning graphs. These graphs link a user's profile and behavioral sequences through causal and logical inferences, representing the user's interests in an interpretable way. Our approach, LLM reasoning graphs (LLMRG), has four components: chained graph reasoning, divergent extension, self-verification and scoring, and knowledge base self-improvement. The resulting reasoning graph is encoded using graph neural networks, and this encoding serves as additional input to improve conventional recommender systems without requiring extra user or item information. Our approach demonstrates how LLMs can enable more logical and interpretable recommender systems through personalized reasoning graphs. LLMRG allows recommendations to benefit from both engineered recommendation systems and LLM-derived reasoning graphs. We demonstrate the effectiveness of LLMRG in enhancing base recommendation models on benchmarks and in real-world scenarios.
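
To make the "additional input" idea concrete, here is a minimal sketch of the fusion step, which the Figure 1 caption describes as a concatenation of $E_{ori}$, $E_{div}$, and $E_{base}$ into $E_{fusion}$. The function names `fuse_embeddings` and `rank_items`, the toy vectors, and the dot-product scoring are assumptions made for illustration; the text here does not specify how the fused embedding is turned into a next-item prediction.

```python
from typing import Dict, List, Sequence


def fuse_embeddings(
    e_ori: Sequence[float],
    e_div: Sequence[float],
    e_base: Sequence[float],
) -> List[float]:
    """Concatenate the reasoning-graph embeddings with the base-model embedding."""
    return [*e_ori, *e_div, *e_base]


def rank_items(
    e_fusion: Sequence[float],
    item_vectors: Dict[str, Sequence[float]],
) -> List[str]:
    """Rank candidate items by dot product with the fused embedding.

    ASSUMPTION: dot-product scoring is a stand-in for the real prediction
    head, which is not described in this section.
    """
    scores: Dict[str, float] = {}
    for item, vec in item_vectors.items():
        if len(vec) != len(e_fusion):
            raise ValueError(f"dimension mismatch for item {item!r}")
        scores[item] = sum(f * v for f, v in zip(e_fusion, vec))
    # Highest score first.
    return sorted(scores, key=lambda item: scores[item], reverse=True)


if __name__ == "__main__":
    # Toy embeddings, purely illustrative.
    fused = fuse_embeddings([1.0, 0.0], [0.5], [0.0, 2.0])
    candidates = {
        "item_a": [1.0, 0.0, 0.0, 0.0, 0.0],
        "item_b": [0.0, 0.0, 0.0, 0.0, 1.0],
        "item_c": [0.0, 1.0, 0.0, 1.0, 0.0],
    }
    print(rank_items(fused, candidates))
```

Because the fusion is a plain concatenation, the base recommender's embedding is passed through unchanged and the reasoning-graph embeddings only add dimensions, which is consistent with the claim that no extra user or item information is required.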

Paper Structure

This paper contains 21 sections, 1 equation, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: The LLMRG framework has two main components: an adaptive reasoning module with self-verification and a base sequential recommendation model. The model concatenates the embeddings from the adaptive reasoning module ($E_{ori}$ and $E_{div}$) with the embedding from the base model ($E_{base}$) to obtain $E_{fusion}$, which is used to predict the next item for the user. The key advantage of this approach is that the adaptive reasoning module constructs personalized reasoning graphs, going beyond purely sequential modeling of user interests. Self-verification and scoring help improve the reasoning process. Fusing the result with a standard recommendation model combines complementary strengths without accessing extra information.
  • Figure 2: Real case studies (ML-1M) for (a) LLMRG and its ablations, (b) LLMRG w/o divergent extension and (c) LLMRG w/o self-verification. Black arrows represent the reasoning procedure, red arrows the divergent extension, and green dashed arrows the abductive reasoning in the self-verification module.
  • Figure 3: Sensitivity of HR and NDCG to the verification scoring threshold $\tau$ and the sequence truncation length $l_{tru}$ on the ML-1M and Amazon Beauty benchmarks.
  • Figure 4: Average LLM access frequency on the ML-1M and Amazon Beauty benchmarks.
  • Figure 5: Prompt examples and real cases for the graph reasoning, self-verification, scoring, and divergent extension modules. The LLM input is passed to the API as a single string divided into three components: task description, example input, and example output. The task description comes first and indicates the type of task being requested, providing critical context that constrains the LLM's behavior. The example input shows the specific content the LLM should respond to for this task. Finally, the example output illustrates the form the reply should take.
  • ...and 1 more figures
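
The three-part prompt layout described in the Figure 5 caption (task description, then example input, then example output) can be sketched as a small template. This is an illustration only: the class name `PromptTemplate`, the section labels, the trailing query slot, and the placeholder strings are all assumptions, and none of the text below is taken from the paper's actual prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptTemplate:
    """Holds the three prompt components named in the Figure 5 caption."""

    task_description: str
    example_input: str
    example_output: str

    def render(self, query: str) -> str:
        """Build the single string that would be sent to an LLM API.

        ASSUMPTION: the real query is appended after the worked example,
        in the usual few-shot style; the caption does not say where it goes.
        """
        parts = [
            f"Task: {self.task_description}",       # comes first, constrains behavior
            f"Example input: {self.example_input}",  # content to respond to
            f"Example output: {self.example_output}",  # desired reply format
            f"Input: {query}",
            "Output:",
        ]
        return "\n\n".join(parts)


if __name__ == "__main__":
    # Placeholder strings, hypothetical and not the paper's prompts.
    template = PromptTemplate(
        task_description="Infer why the user moved from one item to the next.",
        example_input="Profile: <profile>; Sequence: <item 1>, <item 2>",
        example_output="<item 1> -> <reason> -> <item 2>",
    )
    print(template.render("Profile: <profile>; Sequence: <item 3>, <item 4>"))
```

Keeping the three components as separate fields lets each module (graph reasoning, self-verification, scoring, divergent extension) reuse the same rendering logic with its own task description and example pair.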