Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation

Jieyong Kim; Hyunseo Kim; Hyunjin Cho; SeongKu Kang; Buru Chang; Jinyoung Yeo; Dongha Lee

TL;DR

This work introduces Exp3rt, an LLM-based recommender that uses the rich textual signals in reviews for personalized recommendation. A teacher model's reasoning is distilled into a student model across three steps: preference extraction from reviews, user/item profile construction, and reasoning-enabled rating prediction. Exp3rt builds structured preference profiles from reviews, combines them with item descriptions, and predicts ratings through step-by-step textual reasoning; it can also serve as an item reranker in a multi-stage CF pipeline. On IMDB and Amazon-Book, it achieves superior rating prediction and top-k reranking performance and produces faithful, persuasive explanations. The results indicate that distillation-enabled, reasoning-focused LLMs can deliver accurate, explainable recommendations with practical efficiency and flexibility in real-world systems.

Abstract

Recent advancements in Large Language Models (LLMs) have demonstrated exceptional performance across a wide range of tasks, generating significant interest in their application to recommendation systems. However, existing methods have not fully capitalized on the potential of LLMs: they are often constrained by limited input information or fail to fully utilize the models' advanced reasoning capabilities. To address these limitations, we introduce EXP3RT, a novel LLM-based recommender designed to leverage the rich preference information contained in user and item reviews. EXP3RT is fine-tuned through distillation from a teacher LLM to perform three key tasks in order. First, it extracts and encapsulates essential subjective preferences from raw reviews. Second, it aggregates and summarizes them according to specific criteria to create user and item profiles. Third, it generates detailed step-by-step reasoning followed by a predicted rating, i.e., reasoning-enhanced rating prediction, by considering both subjective and objective information from user/item profiles and item descriptions. This personalized preference reasoning enhances rating prediction accuracy and also provides faithful and reasonable explanations for recommendations. Extensive experiments show that EXP3RT outperforms existing methods on both rating prediction and candidate item reranking for top-k recommendation, while significantly enhancing the explainability of recommendation systems.
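The three tasks described in the abstract can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `LLM` callable, the prompt wording, the `Rating: <number>` output convention, and `stub_llm` are all hypothetical stand-ins for the fine-tuned student model and its actual prompts.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical interface: a prompt goes in, generated text comes out.
LLM = Callable[[str], str]


def extract_preferences(llm: LLM, review: str) -> str:
    """Step 1: distill a raw review into a short preference description."""
    return llm(f"Extract the preferences expressed in this review:\n{review}")


def build_profile(llm: LLM, preferences: List[str]) -> str:
    """Step 2: aggregate extracted preferences into a user or item profile."""
    joined = "\n".join(f"- {p}" for p in preferences)
    return llm(f"Summarize these preferences into a profile:\n{joined}")


def predict_rating(
    llm: LLM, user_profile: str, item_profile: str, item_description: str
) -> Tuple[str, float]:
    """Step 3: generate step-by-step reasoning, then parse the final rating."""
    output = llm(
        "Reason step by step, then end with 'Rating: <number>'.\n"
        f"User profile: {user_profile}\n"
        f"Item profile: {item_profile}\n"
        f"Item description: {item_description}"
    )
    # Assumed output convention; the real model's format may differ.
    match = re.search(r"Rating:\s*(\d+(?:\.\d+)?)", output)
    if match is None:
        raise ValueError("no rating found in model output")
    return output, float(match.group(1))


def stub_llm(prompt: str) -> str:
    """Placeholder that returns canned text instead of calling a model."""
    if prompt.startswith("Reason step by step"):
        return "The user enjoys slow-burn thrillers, which matches this item. Rating: 4"
    return "likes slow-burn thrillers"


user_prefs = [extract_preferences(stub_llm, r) for r in ["Loved the tense pacing."]]
item_prefs = [extract_preferences(stub_llm, r) for r in ["A gripping, patient thriller."]]
reasoning, rating = predict_rating(
    stub_llm,
    build_profile(stub_llm, user_prefs),
    build_profile(stub_llm, item_prefs),
    "A 2010 psychological thriller.",
)
```

Keeping the reasoning text alongside the parsed score mirrors the abstract's point that the same generation serves as both the prediction and its explanation.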

Paper Structure (29 sections, 2 equations, 3 figures, 10 tables)

Introduction
Related Work
LLM-based Recommenders for Prediction
LLM-based Recommenders for Explanation
Proposed Method: Exp3rt
Preliminaries
Rating prediction
Knowledge distillation from teacher to student LLM
Preference Extraction from Reviews
User and Item Profile Construction
Reasoning-enhanced Rating Prediction
Optimization
Inference
Rating Score Prediction
Top-k Recommendation with Exp3rt
...and 14 more sections

Figures (3)

Figure 2: Overview of the Exp3rt framework. During training, we distill the reasoning capabilities of a teacher LLM (i.e., GPT-3.5) into our student LLM (i.e., Llama3-8B) across three steps: (1) extracting preference descriptions from raw reviews, (2) constructing user/item profiles by aggregating these preferences, and (3) predicting rating scores based on textual reasoning. During inference, given a user-item pair, Exp3rt sequentially performs these steps to predict the rating; in addition, it can serve as an item reranker for top-k recommendation, compatible with other CF models that efficiently retrieve candidate items.
Figure 3: Human evaluation on pairwise quality comparison of recommendation explanations, generated by Exp3rt and baselines (*: $p$-value < 0.05).
Figure 4: Validation on different LLMs with Exp3rt input.
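
The reranking role mentioned in the Figure 2 caption can be illustrated with a short sketch. It assumes a CF model has already retrieved a candidate list and that a rating predictor is available as a plain function; `rerank_top_k`, `toy_ratings`, and the item names are hypothetical, and how the paper actually derives the ranking score is not stated in this excerpt.

```python
from typing import Callable, List


def rerank_top_k(
    candidates: List[str], predict: Callable[[str], float], k: int
) -> List[str]:
    """Score each CF-retrieved candidate and return the k highest-rated items."""
    scored = [(predict(item), item) for item in candidates]
    # list.sort is stable, so candidates with equal scores keep their CF order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


# Toy predicted ratings standing in for the model's output for one user.
toy_ratings = {"item_a": 3.2, "item_b": 4.7, "item_c": 4.1}
top2 = rerank_top_k(list(toy_ratings), toy_ratings.__getitem__, k=2)
```

Because the expensive predictor runs only on the retrieved candidates, the cost of reranking scales with the candidate list size and not with the full item catalog.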
