Personalized Transformer for Explainable Recommendation

Lei Li, Yongfeng Zhang, Li Chen

TL;DR

We address the lack of personalization in Transformer-based explainable recommendation by introducing PETER, a three-task model that ties user and item IDs to linguistic explanations through a context-prediction objective and a custom attention mask. The model jointly learns explanation generation, ID-to-word mapping, and rating prediction with a small, unpretrained 2-layer Transformer, improving text quality and explainability while remaining efficient. Experiments on Yelp, Amazon, and TripAdvisor show that PETER and its feature-enhanced variant PETER+ outperform baselines on explanation metrics and approach or match the state of the art in recommendation; ablations confirm the importance of context mapping and the masking strategy. The work points to a viable path toward personalized NLG in Transformer architectures, with possible extensions to multimodal, cross-lingual, and interactive personalized systems.

Abstract

Personalization of natural language generation plays a vital role in a large spectrum of tasks, such as explainable recommendation, review summarization, and dialog systems. In these tasks, user and item IDs are important identifiers for personalization. The Transformer, despite its demonstrated language modeling capability, is not personalized and fails to make use of user and item IDs, since the ID tokens are not even in the same semantic space as the words. To address this problem, we present a PErsonalized Transformer for Explainable Recommendation (PETER), for which we design a simple and effective learning objective that uses the IDs to predict the words in the target explanation, so as to endow the IDs with linguistic meanings and to achieve a personalized Transformer. Besides generating explanations, PETER can also make recommendations, which makes it a unified model for the whole recommendation-explanation pipeline. Extensive experiments show that our small unpretrained model outperforms fine-tuned BERT on the generation task in terms of both effectiveness and efficiency, which highlights the importance and utility of our design.
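
To make the learning objective above more concrete, here is a minimal sketch of how a loss that "utilizes the IDs to predict the words in the target explanation" could be computed. It assumes that a single probability distribution over the vocabulary is read off an ID position and scored against every word of the explanation, ignoring word order. The function name `context_prediction_loss`, the toy vocabulary, and the averaging over words are illustrative assumptions, not details taken from this page.

```python
import math


def context_prediction_loss(word_probs, target_word_ids):
    """Average negative log-likelihood of the explanation words.

    word_probs: one probability distribution over the vocabulary, assumed
        to be predicted from an ID position (hypothetical setup).
    target_word_ids: vocabulary indices of the words in the explanation.
    """
    if not target_word_ids:
        raise ValueError("target explanation must contain at least one word")
    nll = 0.0
    for word_id in target_word_ids:
        # Every target word is scored against the same distribution,
        # so the loss does not depend on the order of the words.
        nll -= math.log(word_probs[word_id])
    return nll / len(target_word_ids)


# Toy vocabulary of four words (assumed values for illustration only).
probs = [0.1, 0.6, 0.2, 0.1]
loss = context_prediction_loss(probs, [1, 2])
loss_reordered = context_prediction_loss(probs, [2, 1])
```

Because the same distribution is reused for every target word, minimizing this quantity pushes the ID representation toward the words that tend to appear in that user-item pair's explanations, which is one way to read "endow the IDs with linguistic meanings".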

Paper Structure

This paper contains 22 sections, 7 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Attention visualization of two models when generating an explanation for the same user-item pair (see the first two columns). They are both from the last attention layer, so the target sequences are offset by one position for better illustration. The larger the attention weights, the lighter the cells.
  • Figure 2: Our proposed model PETER that contains three tasks. The input features are optional.
  • Figure 3: The attention masking used in our model that we call PETER masking. The orange box highlights its difference from the Left-to-Right masking.
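
The caption of Figure 3 does not say what the highlighted difference is, so the following is only a sketch under an assumption: that the sequence starts with the user ID followed by the item ID, and that PETER masking differs from Left-to-Right masking only in letting the user position also attend to the item position. The function names and the convention that `True` means "blocked" are illustrative choices.

```python
def left_to_right_mask(n):
    """Causal mask: mask[i][j] is True when position i may NOT attend to j."""
    return [[j > i for j in range(n)] for i in range(n)]


def peter_style_mask(n):
    """Left-to-right mask with one extra allowed cell (assumed layout).

    Assumption: position 0 holds the user ID and position 1 the item ID,
    and the only change is that the user position may see the item position.
    """
    mask = left_to_right_mask(n)
    if n >= 2:
        mask[0][1] = False  # user ID can now attend to item ID
    return mask


# Example with two ID positions followed by three word positions.
causal = left_to_right_mask(5)
peter = peter_style_mask(5)
changed_cells = [
    (i, j) for i in range(5) for j in range(5) if causal[i][j] != peter[i][j]
]
```

Under this assumption the two ID positions see each other in both directions, while every word position still sees only the IDs and the words before it.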