DimeRec: A Unified Framework for Enhanced Sequential Recommendation via Generative Diffusion Models
Wuchao Li, Rui Huang, Haijun Zhao, Chi Liu, Kai Zheng, Qi Liu, Na Mou, Guorui Zhou, Defu Lian, Yang Song, Wentian Bao, Enyun Yu, Wenwu Ou
TL;DR
DimeRec reframes sequential recommendation by generating the next user interest rather than the next item, using a stationary-guidance module to extract stable signals from non-stationary histories and a diffusion-based aggregator to reconstruct recommendations. The approach introduces a Geodesic Random Walk on a spherical embedding space to align diffusion and recommendation objectives, supported by a guidance loss that stabilizes representation learning. Empirical results on three public datasets and a large-scale online deployment demonstrate substantial improvements in retrieval metrics and diversity, complemented by ablations and sensitivity analyses that validate the necessity of each component. The work delivers a practical, scalable framework that narrows the gap between discriminative SR methods and generative diffusion models, with significant potential for real-world recommender systems.
Abstract
Sequential Recommendation (SR) plays a pivotal role in recommender systems by tailoring recommendations to user preferences based on their non-stationary historical interactions. Achieving high-quality performance in SR requires attention to both item representation and diversity. However, designing an SR method that optimizes both simultaneously remains a long-standing challenge. In this study, we address this issue by integrating recent generative Diffusion Models (DMs) into SR. DMs have demonstrated utility in representation learning and diverse image generation. Nevertheless, a straightforward combination of SR and DMs leads to sub-optimal performance due to discrepancies in learning objectives (recommendation vs. noise reconstruction) and the respective learning spaces (non-stationary vs. stationary). To overcome this, we propose a novel framework called DimeRec (Diffusion with multi-interest enhanced Recommender). DimeRec synergistically combines a guidance extraction module (GEM) and a generative diffusion aggregation module (DAM). The GEM extracts crucial stationary guidance signals from the user's non-stationary interaction history, while the DAM employs a generative diffusion process conditioned on the GEM's outputs to reconstruct and generate consistent recommendations. Our numerical experiments demonstrate that DimeRec significantly outperforms established baseline methods across three publicly available datasets. Furthermore, we have successfully deployed DimeRec on a large-scale short video recommendation platform, serving hundreds of millions of users. Live A/B testing confirms that our method improves both users' time spent and result diversification.
