Coherency Improved Explainable Recommendation via Large Language Model
Shijie Liu, Ruixing Ding, Weihai Lu, Jun Wang, Mo Yu, Xiaoming Shi, Wei Zhang
TL;DR
The paper tackles incoherence between predicted ratings and textual explanations in explainable recommender systems. It introduces CIER, a pipeline that uses a decoder-based LLM (LLaMA2-7B) to predict a rating, embeds soft rating information into the language model via SR2WE, and generates a rating-aware explanation conditioned on the user-item context and the rating embedding. Three training techniques (rating smoothing, curriculum learning, and a multi-task loss) improve rating accuracy and explanation quality, while automatic evaluation with GPT-4 and sentiment analysis allows rating-explanation coherence to be assessed at scale. Experiments on Yelp, Amazon, and TripAdvisor show gains of about 7.3% in explainability and about 4.4% in text quality, along with strong rating prediction and coherence performance, which points to the practical value of LLM-driven, coherency-focused explainable recommendation.
Abstract
Explainable recommender systems are designed to explain the reasoning behind each recommendation, enabling users to comprehend the underlying logic. Previous works perform rating prediction and explanation generation in a multi-task manner. However, these works suffer from incoherence between the predicted ratings and the generated explanations. To address this issue, we propose a novel framework that employs a large language model (LLM) to generate a rating, transforms it into a rating vector, and finally generates an explanation based on the rating vector and user-item information. Moreover, we propose utilizing publicly available LLMs and pre-trained sentiment analysis models to automatically evaluate coherence without human annotations. Extensive experimental results on three explainable recommendation datasets show that the proposed framework is effective, outperforming state-of-the-art baselines with improvements of 7.3% in explainability and 4.4% in text quality.
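The step of transforming a predicted rating into a rating vector can be illustrated with a minimal sketch. It assumes, without confirmation from the text above, that the model yields one logit per rating value from 1 to 5 and that the rating vector is the probability-weighted average of per-rating embeddings. The function names `softmax`, `soft_rating_embedding`, and `expected_rating`, and the toy embeddings, are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_rating_embedding(rating_logits, rating_token_embeddings):
    """Blend per-rating embeddings by their predicted probabilities.

    Assumption: rating_logits[k] scores rating k+1, and
    rating_token_embeddings[k] is the embedding of that rating's token.
    """
    probs = softmax(rating_logits)
    dim = len(rating_token_embeddings[0])
    vec = [0.0] * dim
    for p, emb in zip(probs, rating_token_embeddings):
        for i in range(dim):
            vec[i] += p * emb[i]
    return probs, vec


def expected_rating(probs, ratings=(1, 2, 3, 4, 5)):
    """Scalar rating as the mean of the predicted distribution."""
    return sum(p * r for p, r in zip(probs, ratings))


# Toy example: one-hot embeddings, so the blended vector equals the probabilities.
toy_embeddings = [[1.0 if i == k else 0.0 for i in range(5)] for k in range(5)]
probs, rating_vec = soft_rating_embedding([0.1, 0.2, 0.5, 2.0, 1.0], toy_embeddings)
print(rating_vec, expected_rating(probs))
```

Under these assumptions, the blended vector keeps the model's uncertainty about the rating instead of committing to a single hard label, so an explanation decoder conditioned on it sees whether the prediction was, say, confidently 4 or split between 4 and 5.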
