Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment

Yi Fang; Wenjie Wang; Yang Zhang; Fengbin Zhu; Qifan Wang; Fuli Feng; Xiangnan He

TL;DR

This work addresses a limitation of recommendation LLMs (RecLLMs): they are optimized only to predict user feedback directly, which can make them unreliable in complex scenarios. It introduces the Deliberative Recommendation task and the Reason4Rec framework, which decomposes reasoning into three collaborative steps (Summarizer, Reasoner, and Predictor), guided by verbalized user feedback (reviews) and implemented as three QLoRA adapters. On three real-world datasets, Reason4Rec achieves more accurate rating prediction (lower MAE and RMSE) and higher reasoning quality (BLEURT, GPTScore) than traditional, review-based, and other LLM-based baselines, supporting the value of slow, multi-step deliberation. The approach improves the interpretability and reliability of RecLLMs and points to future work on richer verbalized feedback, efficiency optimizations, and interactive human-AI learning settings.

Abstract

While recent advances in aligning Large Language Models (LLMs) with recommendation tasks have shown promising performance, these aligned recommendation LLMs still struggle in complex scenarios. This is primarily because current alignment approaches optimize LLMs to generate user feedback directly, without incorporating deliberation. To overcome this limitation and develop more reliable LLMs for recommendation, we propose a new Deliberative Recommendation task, which adds explicit reasoning about user preferences as an alignment goal. We then introduce the Reasoning-powered Recommender (Reason4Rec) framework for deliberative user preference alignment, which tackles this task by using verbalized user feedback in a step-wise manner to enhance reasoning capabilities. The framework employs collaborative step-wise experts and a tailored training strategy for each expert. Experimental results on three real-world datasets demonstrate the rationality of the deliberative task formulation and the superior performance of the proposed framework in both prediction accuracy and reasoning quality.
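
As a rough illustration of the step-wise design, the sketch below chains the three experts named in the TL;DR (Summarizer, Reasoner, Predictor) and computes the two error metrics it mentions (MAE, RMSE). It is a minimal sketch under stated assumptions, not the authors' implementation: the `TextModel` type, the `deliberate` function, the prompt wording, and the 1-to-5 rating scale are all hypothetical, and each expert is treated as an opaque text-in, text-out callable standing in for an LLM with its own adapter.

```python
import math
from typing import Callable, Dict, List, Sequence, Union

# Hypothetical stand-in for one expert: any function mapping a prompt to text.
TextModel = Callable[[str], str]


def deliberate(
    history: List[str],
    item: str,
    summarizer: TextModel,
    reasoner: TextModel,
    predictor: TextModel,
) -> Dict[str, Union[str, float]]:
    """Run the three steps in order. Prompts are illustrative assumptions."""
    # Step 1: condense the user's past reviews into a preference summary.
    summary = summarizer(
        "Summarize this user's preferences:\n" + "\n".join(history)
    )
    # Step 2: reason about how well the candidate item matches the summary.
    reasoning = reasoner(
        f"Preferences: {summary}\nItem: {item}\nReason about the match."
    )
    # Step 3: turn the reasoning into a numeric rating (assumed 1-5 scale).
    rating = float(
        predictor(f"Reasoning: {reasoning}\nPredict a rating from 1 to 5.")
    )
    return {"summary": summary, "reasoning": reasoning, "rating": rating}


def mae(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute error between predicted and observed ratings."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred: Sequence[float], true: Sequence[float]) -> float:
    """Root mean squared error between predicted and observed ratings."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))
```

Passing the experts in as callables keeps the sketch independent of any particular model library; in practice each one would wrap a model call, and the intermediate summary and reasoning text remain available for inspection alongside the predicted rating.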
