Dual Conditional Diffusion Models for Sequential Recommendation

Hongtao Huang, Chengkai Huang, Tong Yu, Xiaojun Chang, Wen Hu, Julian McAuley, Lina Yao

TL;DR

This paper tackles sequential recommendation with diffusion models by addressing the limitations of purely implicit conditioning, which can overlook crucial sequential and contextual information. It proposes Dual Conditional Diffusion Models for Sequential Recommendation (DCRec), featuring the Dual Conditional Diffusion Transformer (DCDT) that embeds both implicit forward conditioning and explicit reverse conditioning, using input concatenation, CondLN, self-attention, and cross-attention to fuse signals. The approach achieves state-of-the-art performance on public benchmarks while reducing inference steps through early-stage approximations, improving practicality for online systems. The work demonstrates that explicitly leveraging historical guidance alongside implicit history signals yields more accurate and contextually relevant recommendations, with notable gains in both accuracy metrics and computational efficiency.

Abstract

Recent advancements in diffusion models have shown promising results in sequential recommendation (SR). Existing approaches predominantly rely on implicit conditional diffusion models, which compress user behaviors into a single representation during the forward diffusion process. While effective to some extent, this oversimplification often leads to the loss of sequential and contextual information, which is critical for understanding user behavior. Moreover, explicit information, such as user-item interactions or sequential patterns, remains underutilized, despite its potential to directly guide the recommendation process and improve precision. However, combining implicit and explicit information is non-trivial, as it requires dynamically integrating these complementary signals while avoiding noise and irrelevant patterns within user behaviors. To address these challenges, we propose Dual Conditional Diffusion Models for Sequential Recommendation (DCRec), which effectively integrates implicit and explicit information by embedding dual conditions into both the forward and reverse diffusion processes. This allows the model to retain valuable sequential and contextual information while leveraging explicit user-item interactions to guide the recommendation process. Specifically, we introduce the Dual Conditional Diffusion Transformer (DCDT), which employs a cross-attention mechanism to dynamically integrate explicit signals throughout the diffusion stages, ensuring contextual understanding and minimizing the influence of irrelevant patterns. This design enables precise and contextually relevant recommendations. Extensive experiments on public benchmark datasets demonstrate that DCRec significantly outperforms state-of-the-art methods in both accuracy and computational efficiency.

Paper Structure

This paper contains 34 sections, 21 equations, 9 figures, 4 tables, 2 algorithms.

Figures (9)

  • Figure 1: Figures (a) and (b) are examples of implicit and explicit conditional diffusion models for SR.
  • Figure 2: (a) illustrates a simplified workflow of our DCRec. During the end-to-end diffusion process, DCRec implicitly concatenates the noisy target item embedding $\bm{e_t}$ and the corresponding noisy history embedding sequence $\bm{H_t}$ at each diffusion step. Meanwhile, DCRec is also guided by the clean history embedding $\bm{H_0}$ as an explicit conditional signal. (b) depicts the forward and reverse diffusion processes with conditioning. There are two denoising trajectories with implicit guidance. The dual conditional trajectory with additional explicit guidance is more likely to reach the target item (the orange area).
  • Figure 3: The design of DCDT. The black box refers to the main architecture, the red box refers to the details within the transformer block, and the yellow box refers to the details of the CondLN module.
  • Figure 4: The HR@5 results of training DCRec using different types of optimization losses on Beauty and Toys datasets.
  • Figure 5: The HR@5 results of using different modules.
  • ...and 4 more figures
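
The dual conditioning described in the captions for Figures 2 and 3 can be illustrated with a small NumPy sketch. This is not the authors' implementation: it is a minimal, single-head illustration under stated assumptions. The function names (`add_noise`, `cond_layer_norm`, `cross_attention`), the dimensions, the fixed noise level `alpha_bar`, and the choice to drive CondLN with a diffusion-step embedding are all hypothetical, and the learned query/key/value projections of a real transformer block are omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def cond_layer_norm(x, cond, w_scale, w_shift):
    # Conditional LayerNorm: scale and shift are predicted from a
    # conditioning vector (assumed here to be a diffusion-step embedding).
    scale = cond @ w_scale
    shift = cond @ w_shift
    return layer_norm(x) * (1.0 + scale) + shift

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(queries, context):
    # Single-head attention without learned projections (a simplification).
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d))
    return weights @ context, weights

def add_noise(x0, alpha_bar, rng):
    # Standard forward-diffusion corruption at cumulative noise level alpha_bar.
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

rng = np.random.default_rng(0)
n_hist, d = 5, 8                      # hypothetical sizes
H0 = rng.standard_normal((n_hist, d)) # clean history embeddings H_0
e0 = rng.standard_normal((1, d))      # clean target item embedding
alpha_bar = 0.5                       # hypothetical noise level

# Implicit condition: noisy history H_t is concatenated with the noisy target e_t.
Ht = add_noise(H0, alpha_bar, rng)
et = add_noise(e0, alpha_bar, rng)
x = np.concatenate([Ht, et], axis=0)  # shape (n_hist + 1, d)

# CondLN modulated by an (assumed) diffusion-step embedding.
step_emb = rng.standard_normal(d)
w_scale = 0.1 * rng.standard_normal((d, d))
w_shift = 0.1 * rng.standard_normal((d, d))
h = cond_layer_norm(x, step_emb, w_scale, w_shift)

# Explicit condition: every noisy token attends to the clean history H_0.
attended, weights = cross_attention(h, H0)
out = x + attended                    # residual connection
```

The last row of `out` corresponds to the target item position. In this sketch it has been informed both by the noisy history it was concatenated with (implicit) and by the clean history it attended to (explicit).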