Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Yang Zhang, Juntao You, Yimeng Bai, Jizhi Zhang, Keqin Bao, Wenjie Wang, Tat-Seng Chua

TL;DR

This study uses counterfactual reasoning to identify the causal effects of behavior sequences on model output and introduces a task that fits the ground-truth labels directly from these effects, thereby explicitly emphasizing the role of behavior sequences in recommendation.

Abstract

Recent advancements in recommender systems have focused on leveraging Large Language Models (LLMs) to improve user preference modeling, yielding promising outcomes. However, current LLM-based approaches struggle to fully leverage user behavior sequences, resulting in suboptimal preference modeling for personalized recommendations. In this study, we propose a novel Counterfactual Fine-Tuning (CFT) method to address this issue by explicitly emphasizing the role of behavior sequences when generating recommendations. Specifically, we employ counterfactual reasoning to identify the causal effects of behavior sequences on model output and introduce a task that directly fits the ground-truth labels based on these effects, thereby making the emphasis explicit. Additionally, we develop a token-level weighting mechanism to adjust the emphasis strength for different item tokens, reflecting the diminishing influence of behavior sequences from earlier to later tokens when predicting an item. Extensive experiments on real-world datasets demonstrate that CFT effectively improves behavior sequence modeling. Our code is available at https://github.com/itsmeyjt/CFT.
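
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the causal effect of the behavior sequence is approximated by the difference between per-token logits computed with and without the sequence in the prompt, and that token weights decay geometrically with token position. The function names, the `decay` parameter, and the normalization are all hypothetical choices made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def causal_effect_loss(logits_with_history, logits_without_history,
                       target_ids, decay=0.5):
    """Weighted cross-entropy on the (assumed) causal effect of the history.

    logits_with_history[t]    : logits for item token t given the full prompt
    logits_without_history[t] : logits for the same token with the behavior
                                sequence removed (the counterfactual input)
    target_ids[t]             : ground-truth token id at position t
    decay                     : hypothetical per-position weight decay, so
                                earlier item tokens are emphasized more
    """
    total, weight_sum = 0.0, 0.0
    steps = zip(logits_with_history, logits_without_history, target_ids)
    for t, (factual, counterfactual, target) in enumerate(steps):
        # Assumed effect estimate: factual minus counterfactual logits.
        effect = [f - c for f, c in zip(factual, counterfactual)]
        weight = decay ** t  # weight shrinks for later tokens
        total += -weight * log_softmax(effect)[target]
        weight_sum += weight
    return total / weight_sum


# Toy example: 2 item tokens, vocabulary of 3 tokens.
no_history = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
with_history = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
targets = [0, 1]

loss_helpful = causal_effect_loss(with_history, no_history, targets)
loss_ignored = causal_effect_loss(no_history, no_history, targets)
```

In this toy setup, `loss_ignored` corresponds to a model whose output does not change when the behavior sequence is removed, so the effect is zero everywhere and the loss equals the uniform-guess value. In a multi-task setting such a term would be added to the usual fine-tuning loss, which is omitted here.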

Paper Structure

This paper contains 26 sections, 6 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of recommendation distributions for the LLM-based method BIGRec with and without historical behavior sequences as input, on Amazon data. The two settings yield similar distributions of recommended items, indicating that the behavior sequence information is not fully utilized.
  • Figure 2: Causal graph illustrating the prediction generation process in LLM-based recommendation: the input behavior sequences $H$ and other input information $I$ (e.g., task instructions) can influence the ($t$-th token) prediction $Y_t$ directly or indirectly by activating the pre-training knowledge $E$.
  • Figure 3: An overview of the proposed CFT framework, which includes two key components: a new task (the causal loss component) introduced in a multi-task manner and a token-level weighting mechanism.
  • Figure 4: Top-20 recommendation distribution comparison between BIGRec (Baseline) and BIGRec + CFT (CFT).
  • Figure 5: Performance comparison on the Steam dataset.