Diffusion-based Contrastive Learning for Sequential Recommendation

Ziqiang Cui, Haolun Wu, Bowei He, Ji Cheng, Chen Ma

TL;DR

CaDiRec tackles data sparsity and semantic drift in sequential recommendation by introducing a context-aware diffusion model that generates context-consistent augmented views for contrastive learning. The method jointly trains a Transformer-based SR model and a diffusion-based augmenter with shared item embeddings, optimizing a combined objective that includes the SR loss, a contrastive loss, and a diffusion loss. Empirical results on five benchmarks show CaDiRec achieving state-of-the-art performance, with ablations confirming the importance of context guidance, diffusion training, and contrastive learning. The approach offers practical benefits in producing realistic augmentations and robust representations across varying data sparsity levels.
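As a rough sketch of the combined objective described above, assuming the three terms are summed with scalar weights (the weights $\lambda_{\text{cl}}$ and $\lambda_{\text{diff}}$ are hypothetical names used here for illustration; this summary does not state the exact weighting):

$$
\mathcal{L} = \mathcal{L}_{\text{sr}} + \lambda_{\text{cl}}\,\mathcal{L}_{\text{cl}} + \lambda_{\text{diff}}\,\mathcal{L}_{\text{diff}}
$$

Here $\mathcal{L}_{\text{sr}}$ is the SR loss, $\mathcal{L}_{\text{cl}}$ the contrastive loss, and $\mathcal{L}_{\text{diff}}$ the diffusion loss. Because the item embeddings are shared, gradients from all three terms update the same embedding table.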

Abstract

Contrastive learning has been effectively utilized to enhance the training of sequential recommendation models by leveraging informative self-supervised signals. Most existing approaches generate augmented views of the same user sequence through random augmentation and subsequently maximize their agreement in the representation space. However, these methods often neglect the rationality of the augmented samples. Due to significant uncertainty, random augmentation can disrupt the semantic information and interest evolution patterns inherent in the original user sequences. Moreover, pulling semantically inconsistent sequences closer in the representation space can render the user sequence embeddings insensitive to variations in user preferences, which contradicts the primary objective of sequential recommendation. To address these limitations, we propose the Context-aware Diffusion-based Contrastive Learning for Sequential Recommendation, named CaDiRec. The core idea is to leverage context information to generate more reasonable augmented views. Specifically, CaDiRec employs a context-aware diffusion model to generate alternative items for the given positions within a sequence. These generated items are aligned with their respective context information and can effectively replace the corresponding original items, thereby generating a positive view of the original sequence. By considering two different augmentations of the same user sequence, we can construct a pair of positive samples for contrastive learning. To ensure representation cohesion, we train the entire framework in an end-to-end manner, with shared item embeddings between the diffusion model and the recommendation model. Extensive experiments on five benchmark datasets demonstrate the advantages of our proposed method over existing baselines.
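The view-construction and contrastive steps in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: `propose_item` is a hypothetical stand-in for the context-aware diffusion model, `toy_proposer` is a placeholder that performs no real generation, and the contrastive loss is written as InfoNCE with cosine similarity, a common choice that this summary does not confirm.

```python
import math
import random


def substitute_positions(sequence, positions, propose_item):
    """Return a copy of `sequence` with the items at `positions` replaced.

    `propose_item(sequence, pos)` is a hypothetical stand-in for the
    context-aware diffusion model: it sees the whole sequence (the context)
    and returns a replacement item for position `pos`.
    """
    view = list(sequence)
    for pos in positions:
        view[pos] = propose_item(sequence, pos)
    return view


def make_positive_pair(sequence, ratio, propose_item, rng):
    """Build two augmented views, each with independently sampled positions."""
    k = max(1, int(len(sequence) * ratio))
    views = []
    for _ in range(2):
        positions = rng.sample(range(len(sequence)), k)
        views.append(substitute_positions(sequence, positions, propose_item))
    return views[0], views[1]


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(batch_a, batch_b, temperature=0.5):
    """Average InfoNCE loss (assumed form, not confirmed by the source).

    batch_a[i] and batch_b[i] are the embeddings of a positive pair; every
    other row of batch_b acts as a negative for batch_a[i].
    """
    total = 0.0
    for i, anchor in enumerate(batch_a):
        logits = [_cosine(anchor, other) / temperature for other in batch_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]
    return total / len(batch_a)


def toy_proposer(sequence, pos):
    """Placeholder proposer: shifts the item id so substitutions are visible."""
    return sequence[pos] + 100


rng = random.Random(0)
original = [11, 12, 13, 14, 15, 16]
view_1, view_2 = make_positive_pair(original, 0.34, toy_proposer, rng)
```

In a real pipeline the two views would be encoded by the SR model, and the resulting sequence embeddings would be passed to the contrastive loss as `batch_a` and `batch_b`.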

Paper Structure

This paper contains 32 sections, 15 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: An example of augmented sequences with semantic discrepancies, where view 1 and view 2 are two augmented views of the original user sequence by random substitution.
  • Figure 2: Overview of our proposed CaDiRec. CaDiRec employs a context-aware diffusion model to generate reasonable augmented views for contrastive learning. The context-aware diffusion model comprises a forward process with partial position noising and a reverse process with context-conditional denoising. These designs enable the model to generate contextually appropriate substitutions for selected positions, leading to the production of reasonable augmented sequences. To effectively capture contextual dependencies, CaDiRec employs a bidirectional Transformer architecture within the diffusion model.
  • Figure 3: Hyperparameter study of $\alpha$, $\beta$, and $\rho$ on five datasets.
  • Figure 4: Performance comparison on different user groups.
  • Figure 5: Visualization of learned sequence representations.
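The "partial position noising" named in the Figure 2 caption can be pictured with a small sketch. It assumes the standard Gaussian forward step $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ applied only at the selected positions, with context positions left clean. The function name, the `alpha_bar` parameter, and the list-of-lists embedding layout are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def partial_position_noising(embeddings, noised_positions, alpha_bar, rng):
    """Noise only the selected positions; leave context embeddings untouched.

    embeddings:       list of item embedding vectors, one per sequence position
    noised_positions: positions whose items are to be regenerated
    alpha_bar:        cumulative noise-schedule value in [0, 1] (assumed name)
    """
    selected = set(noised_positions)
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noisy = []
    for pos, vector in enumerate(embeddings):
        if pos in selected:
            noisy.append(
                [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in vector]
            )
        else:
            # Context position: copied as-is so the denoiser can condition on it.
            noisy.append(list(vector))
    return noisy


rng = random.Random(0)
sequence_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
noisy_embeddings = partial_position_noising(sequence_embeddings, [1], 0.5, rng)
```

A bidirectional denoiser would then receive `noisy_embeddings` and predict clean embeddings for the noised positions, using the untouched positions on both sides as context.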