ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation

Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang

TL;DR

ELCoRec injects preference understanding capability into an LLM via a GAT expert model, which encodes user preference by propagating in parallel the temporal relations, rating signals, and various side information of historical items.

Abstract

Large language models (LLMs) have been flourishing in the natural language processing (NLP) domain, and their potential for recommendation has attracted much attention. Despite the intelligence shown by recommendation-oriented fine-tuned models, LLMs struggle to fully understand user behavior patterns because of their innate weakness in interpreting numerical features and the overhead of long contexts. As a result, the temporal relations among user behaviors, the subtle quantitative signals among different ratings, and the various side features of items are not well explored. Existing works only fine-tune a single LLM on the given text data without introducing this important information, leaving these problems unsolved. In this paper, we propose ELCoRec to Enhance Language understanding with Co-Propagation of numerical and categorical features for Recommendation. Concretely, we inject the preference understanding capability into the LLM via a graph attention network (GAT) expert model, in which user preference is better encoded by propagating in parallel the temporal relations, rating signals, and various side information of historical items. The parallel propagation mechanism can stabilize heterogeneous features and offer an informative user preference encoding, which is then injected into the language model via soft prompting at the cost of a single token embedding. To further capture the user's recent interests, we propose a novel Recent interaction Augmented Prompt (RAP) template. Experimental results on three datasets against strong baselines validate the effectiveness of ELCoRec. The code is available at https://anonymous.4open.science/r/CIKM_Code_Repo-E6F5/README.md.
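
The abstract states that the user preference encoding is injected via soft prompting at the cost of a single token embedding. The following is a minimal sketch of that idea, not the authors' implementation: the linear projector, the function names `project` and `inject_soft_prompt`, and the toy embedding table are all assumptions made for illustration. It shows how one placeholder position in the input sequence can carry the expert model's vector while every other token keeps its ordinary embedding.

```python
from typing import Dict, List

Vector = List[float]


def project(vec: Vector, weight: List[Vector]) -> Vector:
    """Map an expert-model vector into the LLM embedding space.

    `weight` is a hypothetical linear projector (one row per output
    dimension); the source does not specify how dimensions are aligned.
    """
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def inject_soft_prompt(
    token_ids: List[int],
    embedding_table: Dict[int, Vector],
    placeholder_id: int,
    user_vec: Vector,
    weight: List[Vector],
) -> List[Vector]:
    """Embed a token sequence, swapping in the user preference vector.

    Every token is looked up in the embedding table as usual, except the
    placeholder token, whose embedding is replaced by the projected user
    preference vector. The sequence length is unchanged, so the injection
    costs exactly one token position.
    """
    injected = project(user_vec, weight)
    embedded = []
    for tid in token_ids:
        if tid == placeholder_id:
            embedded.append(list(injected))
        else:
            embedded.append(list(embedding_table[tid]))
    return embedded


# Toy example: 2-dimensional LLM embeddings, 3-dimensional expert vector.
table = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [9.0, 9.0]}  # id 2 is the placeholder
proj = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
inputs_embeds = inject_soft_prompt([0, 2, 1], table, 2, [1.0, 2.0, 3.0], proj)
```

In a real model the resulting list of vectors would be passed to the LLM in place of the token-id input; how ELCoRec wires this up is not described in this summary.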

Paper Structure (27 sections, 20 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Illustration of the motivation. The LLM struggles to understand numerical features such as ratings and timestamps due to numerical insensitivity. Encoding overhead prevents abundant side features such as producer and genre from being included in the template, and time series information is lost during retrieval.
  • Figure 2: Framework overview. (a) RAP, which connects the item sequence retrieved from user history (marked in green) and the recent item sequence (marked in blue), along with the placeholder token for embedding injection (marked in orange), to form the textual prompt. (b) GAT expert network. The side features of users and items, along with numerical information, are encoded and interacted. (c) Representations from the GAT expert network are injected into the LLM's latent space, and (d) instruction tuning is performed.
  • Figure 3: Item description construction. Each item's semantic description is obtained via the "Feature is Value" template.
  • Figure 4: Construction process of RAP template. $x_i^{ret}$ denotes the retrieved history items.
  • Figure 5: t-SNE visualization of the GAT and LLM embeddings.
  • ...and 1 more figures
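
The captions of Figures 2(a), 3, and 4 describe how the textual prompt is assembled: each item is verbalized with a "Feature is Value" template, and the RAP template joins the retrieved history items, the recent items, and a placeholder token for embedding injection. The sketch below illustrates that assembly under stated assumptions. Only the "Feature is Value" pattern and the three prompt components come from the captions; the section wording, the placeholder string `<UserPref>`, the closing question, and the function names are hypothetical.

```python
from typing import Dict, List

Item = Dict[str, str]


def describe_item(features: Item) -> str:
    """Verbalize an item with the "Feature is Value" pattern."""
    return ", ".join(f"{name} is {value}" for name, value in features.items())


def build_rap_prompt(
    retrieved: List[Item],
    recent: List[Item],
    target: Item,
    placeholder: str = "<UserPref>",  # hypothetical placeholder token
) -> str:
    """Join retrieved items, recent items, and the placeholder into a prompt.

    The section headers and the final question are illustrative wording,
    not the template used in the paper.
    """
    lines = ["Retrieved history items:"]
    lines += [f"- {describe_item(item)}" for item in retrieved]
    lines.append("Most recent items:")
    lines += [f"- {describe_item(item)}" for item in recent]
    lines.append(f"User preference: {placeholder}")
    lines.append(f"Target item: {describe_item(target)}")
    lines.append("Will the user like the target item? Answer Yes or No.")
    return "\n".join(lines)


# Toy items with made-up feature values.
retrieved_items = [{"title": "Heat", "genre": "Crime", "rating": "5"}]
recent_items = [{"title": "Alien", "genre": "Horror", "rating": "4"}]
target_item = {"title": "Ronin", "genre": "Action"}
prompt = build_rap_prompt(retrieved_items, recent_items, target_item)
```

The placeholder appears exactly once, so the position it tokenizes to is where an injected embedding would go.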