Preference Diffusion for Recommendation

Shuo Liu, An Zhang, Guoqing Hu, Hong Qian, Tat-seng Chua

TL;DR

This work tackles personalized ranking in diffusion-based recommender systems by reframing Bayesian Personalized Ranking (BPR) as a log-likelihood objective over diffusion-generated item distributions. Because this objective is intractable along the diffusion paths, a variational upper bound is derived and minimized instead, and the loss is enhanced with cosine error and multiple negative samples, yielding a stable and scalable training regime. The proposed PreferDiff demonstrates substantial gains over state-of-the-art diffusion-based recommenders across multiple benchmarks and enables zero-shot transfer via text-embedding–driven variants (PreferDiff-T). The approach also connects to Direct Preference Optimization (DPO), supporting alignment of user preferences with generative modeling, and shows strong generalization when pretrained on large, diverse datasets. The authors note two limitations, sensitivity to embedding dimensionality and to hyperparameter tuning, and release public code for reproducibility.

Abstract

Recommender systems predict personalized item rankings based on user preference distributions derived from historical behavior data. Recently, diffusion models (DMs) have gained attention in recommendation for their ability to model complex distributions, yet current DM-based recommenders often rely on traditional objectives like mean squared error (MSE) or recommendation objectives, which are not optimized for personalized ranking tasks or fail to fully leverage DMs' generative potential. To address this, we propose PreferDiff, a tailored optimization objective for DM-based recommenders. PreferDiff transforms BPR into a log-likelihood ranking objective and integrates multiple negative samples to better capture user preferences. Specifically, we employ variational inference to handle the intractability by minimizing the variational upper bound, and we replace MSE with cosine error to improve alignment with recommendation tasks. Finally, we balance learning generation and preference to enhance the training stability of DMs. PreferDiff offers three key benefits: it is the first personalized ranking loss designed specifically for DM-based recommenders; it improves ranking performance and converges faster by addressing hard negatives; and it is provably connected to Direct Preference Optimization, which indicates that it has the potential to align user preferences in DM-based recommenders via generative modeling. Extensive experiments across three benchmarks validate its superior recommendation performance and commendable general sequential recommendation capabilities. Our code is available at https://github.com/lswhim/PreferDiff.
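
The ingredients named in the abstract (a BPR-style log-sigmoid ranking term, cosine error in place of MSE, several negative samples, and a weight that balances generation against preference) can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the weight `lam`, the choice to average the negatives' errors, and the exact form of the margin are all assumptions made for illustration. The released repository is the reference for the real loss.

```python
import math


def cosine_error(pred, target):
    """Cosine error 1 - cos(pred, target) between two non-zero vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(pos_err, neg_errs, lam=0.5):
    """Hypothetical blend of a generation term and a BPR-style ranking term.

    pos_err  : denoising error for the positive (interacted) item
    neg_errs : denoising errors for the sampled negative items
    lam      : assumed weight on the generation term (illustrative only)
    """
    # Assumption: summarize the negatives by their mean error.
    mean_neg = sum(neg_errs) / len(neg_errs)
    # BPR-style term: small when the positive is reconstructed
    # better (lower error) than the negatives.
    ranking = -log_sigmoid(mean_neg - pos_err)
    return lam * pos_err + (1.0 - lam) * ranking


# Toy embeddings: the prediction is close to the positive item
# and far from the negatives.
pred = [1.0, 0.1]
pos_err = cosine_error(pred, [1.0, 0.0])
neg_errs = [cosine_error(pred, [0.0, 1.0]), cosine_error(pred, [-1.0, 0.0])]
loss = preference_loss(pos_err, neg_errs, lam=0.5)
```

In this sketch the ranking term shrinks as the gap between negative and positive error grows, so a prediction that sits nearer the positive item than the negatives receives a lower loss than one that does the reverse.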

Paper Structure

This paper contains 37 sections, 62 equations, 12 figures, 20 tables, 2 algorithms.

Figures (12)

  • Figure 1: Illustration of user preference distributions modeled by DM-based recommenders. (a) Neglecting the negative item distribution can leave predicted items closer to negative items. (b) Incorporating negative sampling improves the model's understanding of user preferences.
  • Figure 2: Training Comparison with DreamRec on Amazon Beauty.
  • Figure 3: Effect of the Embedding Size for PreferDiff.
  • Figure 4: Positive Correlation Between Training Data Scale and General Sequential Recommendation Performance.
  • Figure 5: Effect of the $\lambda$ for PreferDiff.
  • ...and 7 more figures