Uncertainty-Aware Explainable Recommendation with Large Language Models

Yicui Peng, Hao Chen, Chingsheng Lin, Guo Huang, Jinrong Hu, Hui Guo, Bin Kong, Shu Hu, Xi Wu, Xin Wang

TL;DR

The paper tackles explainable recommendations with resource-efficient LLMs by using continuous prompts formed from user/item ID vectors fed into GPT-2, and jointly trains rating prediction and explanation generation under an uncertainty-based loss. The approach leverages multi-task learning to transfer user interests to explanations, achieving superior explainability metrics (e.g., USR, FCR, DIV) while preserving textual quality across Yelp, TripAdvisor, and Amazon datasets. Key contributions include (i) continuous prompts for explanations, (ii) a joint rating+explanation objective with trainable uncertainty weights, and (iii) empirical evidence of improved explainability and stable text generation without heavy LLM fine-tuning. The results demonstrate practical impact for scalable, interpretable recommendations in real-world systems.

Abstract

Providing explanations within a recommendation system can boost user satisfaction and foster trust, especially when the explanations elaborate on why the recommended items were selected for the user. The predominant approach in this domain is to generate text-based explanations, with a notable emphasis on applying large language models (LLMs). However, fine-tuning LLMs for explainable recommendation is often impractical because of time constraints and limited computing resources. As an alternative, the current approach trains the prompt rather than the LLM. In this study, we developed a model that uses the ID vectors of the user and item inputs as prompts for GPT-2. We employed a joint training mechanism within a multi-task learning framework to optimize both the recommendation task and the explanation task. This strategy enables a more effective exploration of users' interests, improving recommendation effectiveness and user satisfaction. In experiments, our method achieves 1.59 DIV, 0.57 USR and 0.41 FCR on the Yelp, TripAdvisor and Amazon datasets, respectively, outperforming four SOTA methods on the explainability evaluation metrics. In addition, the proposed model maintains stable textual quality on the three public datasets.
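The abstract reports DIV, USR and FCR without defining them. The following is a minimal sketch, assuming the definitions commonly used in prior explainable-recommendation work: USR as the share of distinct generated sentences, FCR as the share of the feature vocabulary that appears in at least one explanation, and DIV as the average number of features shared by a pair of explanations (lower means more diverse). The function names and the toy data are hypothetical, and the paper's exact formulas may differ.

```python
from itertools import combinations


def unique_sentence_ratio(sentences):
    """USR (assumed definition): distinct generated sentences / all generated sentences."""
    return len(set(sentences)) / len(sentences)


def feature_coverage_ratio(feature_sets, all_features):
    """FCR (assumed definition): distinct features mentioned in any explanation,
    divided by the size of the feature vocabulary."""
    vocabulary = set(all_features)
    covered = set().union(*feature_sets) & vocabulary
    return len(covered) / len(vocabulary)


def feature_diversity(feature_sets):
    """DIV (assumed definition): mean feature overlap over all pairs of explanations.
    Requires at least two explanations."""
    pairs = list(combinations(feature_sets, 2))
    return sum(len(a & b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four generated explanations and the features each mentions.
sentences = [
    "the food is great",
    "the food is great",
    "the room is clean",
    "great service",
]
feature_sets = [{"food"}, {"food"}, {"room"}, {"service"}]
all_features = ["food", "room", "service", "price", "location"]

usr = unique_sentence_ratio(sentences)
fcr = feature_coverage_ratio(feature_sets, all_features)
div = feature_diversity(feature_sets)
```

Under these assumed definitions, higher USR and FCR indicate less repetitive and broader explanations, while a lower DIV indicates that explanations for different user-item pairs share fewer features.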

Paper Structure

This paper contains 14 sections, 10 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: RS and ER stand for Recommendation System and Explainable Recommendation respectively. Our goal is to leverage the power of LLM to generate natural language sentences that explain recommendations based on user-item pairs provided by RS.
  • Figure 2: Structure of the multi-task learning model, which follows the Seq2Seq framework and generates a natural language sentence explaining why item $i$ is recommended to user $u$. First, the RS (MF) produces the rating prediction $\hat{r}_{u,i}$ as the inner product of the representations of user $u$ and item $i$. The rating-prediction loss is the mean squared error, as shown in Eq. \ref{eq7}. Next, the user $u$ and item $i$ are treated as two special tokens, vectorized, and used as the continuous prompt $[u,i]$, which forms the first part of the input. The second part of the input is the text of the recommendation explanation, denoted $[e_{1}, \cdots , e_{n-1}]$. The full input is passed through the LLM (GPT-2), followed by a fully connected layer with softmax. Tokens are generated step by step, starting from step 1, with each token's final representation $O_{s}$ used for next-word prediction. The optimization objective for explanation generation is the negative log-likelihood loss, as expressed in Eq. \ref{eq4}. Finally, a joint training mechanism within a multi-task learning framework optimizes the combined loss, as described in Eq. \ref{eq10}.
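
The pipeline described in the Figure 2 caption can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the equations are not reproduced in this summary, so the joint loss below uses a common uncertainty-weighting form in which each task loss is scaled by exp(-s) and s is added as a regularizer, with s a trainable log-variance per task. All function and parameter names are hypothetical, and GPT-2 itself is replaced by a list of probabilities assigned to the ground-truth tokens.

```python
import math


def predict_rating(user_vec, item_vec):
    """Matrix-factorization rating: inner product of the user and item vectors."""
    return sum(p * q for p, q in zip(user_vec, item_vec))


def rating_loss(ratings, predictions):
    """Mean squared error between observed and predicted ratings."""
    return sum((r - p) ** 2 for r, p in zip(ratings, predictions)) / len(ratings)


def build_input(user_vec, item_vec, explanation_embeddings):
    """Continuous prompt [u, i] followed by the explanation token embeddings [e_1 .. e_{n-1}]."""
    return [user_vec, item_vec] + list(explanation_embeddings)


def explanation_loss(target_token_probs):
    """Mean negative log-likelihood of the ground-truth tokens.
    Each entry is the softmax probability the model gave to the correct next token."""
    return -sum(math.log(p) for p in target_token_probs) / len(target_token_probs)


def joint_loss(l_rating, l_expl, log_var_rating, log_var_expl):
    """Assumed uncertainty-weighted objective: exp(-s) * L + s per task,
    where s is a trainable log-variance. The paper's exact form may differ."""
    return (
        math.exp(-log_var_rating) * l_rating + log_var_rating
        + math.exp(-log_var_expl) * l_expl + log_var_expl
    )


# Hypothetical toy values for one user-item pair.
user_vec, item_vec = [1.0, 2.0], [0.5, 0.25]
r_hat = predict_rating(user_vec, item_vec)
l_r = rating_loss([4.0], [r_hat])
model_input = build_input(user_vec, item_vec, [[0.3, 0.1], [0.2, 0.4]])
l_e = explanation_loss([0.5, 0.25])
total = joint_loss(l_r, l_e, 0.0, 0.0)
```

In this form, a larger log-variance down-weights the corresponding task loss, while the additive term discourages the trivial solution of pushing both variances toward infinity. In a real training loop the two log-variances would be optimized by gradient descent together with the prompt vectors.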