DIFF: Dual Side-Information Filtering and Fusion for Sequential Recommendation
Hye-young Kim, Minjin Choi, Sunkyung Lee, Ilwoong Baek, Jongwuk Lee
TL;DR
DIFF addresses two weaknesses of side-information integrated sequential recommendation (SISR): noisy signals in item sequences and underuse of item attributes. It applies frequency-based noise filtering to separate low-frequency, stable interests from high-frequency fluctuations, and dual multi-sequence fusion to jointly model intra-attribute and inter-attribute correlations; an additional representation alignment loss harmonizes the ID and attribute spaces. Empirical results on four benchmark datasets (Yelp and Amazon-related datasets) show gains of up to 14.1% in Recall@20 and 12.5% in NDCG@20 over state-of-the-art SR and SISR baselines. The method is also reported to be robust to noisy histories and to improve performance in cold-start and tail-item scenarios, highlighting practical benefits for real-world recommender systems.
Abstract
Side-information Integrated Sequential Recommendation (SISR) benefits from auxiliary item information to infer hidden user preferences, which is particularly effective for sparse interactions and cold-start scenarios. However, existing studies face two main challenges: (i) they fail to remove noisy signals in the item sequence, and (ii) they underutilize the potential of side-information integration. To tackle these issues, we propose a novel SISR model, Dual Side-Information Filtering and Fusion (DIFF), which employs frequency-based noise filtering and dual multi-sequence fusion. Specifically, we convert the item sequence to the frequency domain to filter out noisy short-term fluctuations in user interests. We then combine early and intermediate fusion to capture diverse relationships across item IDs and attributes. Thanks to this filtering and fusion strategy, DIFF is more robust in learning subtle and complex item correlations in the sequence. DIFF outperforms state-of-the-art SISR models, achieving improvements of up to 14.1% and 12.5% in Recall@20 and NDCG@20 across four benchmark datasets.
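
To make the frequency-domain filtering idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a sequence of item embeddings stored as a NumPy array of shape (sequence length, embedding dimension), a hard cutoff between low and high frequencies, and a fixed scalar `beta` that rescales the high-frequency part. The names `split_frequency_bands`, `filter_sequence`, `cutoff`, and `beta` are hypothetical and chosen only for illustration; the abstract does not specify these details.

```python
import numpy as np


def split_frequency_bands(seq_emb, cutoff):
    """Split a (seq_len, dim) embedding sequence into low/high frequency parts.

    The two parts sum back to the input. `cutoff` is the number of lowest
    frequency bins (including the DC term) kept in the low-frequency part.
    """
    seq_len = seq_emb.shape[0]
    # Real FFT along the time (sequence) axis: shape (seq_len // 2 + 1, dim).
    spectrum = np.fft.rfft(seq_emb, axis=0)
    # Zero out every bin at or above the cutoff to keep only slow trends.
    low_spectrum = spectrum.copy()
    low_spectrum[cutoff:] = 0
    low = np.fft.irfft(low_spectrum, n=seq_len, axis=0)
    # Whatever remains is the high-frequency (short-term fluctuation) part.
    high = seq_emb - low
    return low, high


def filter_sequence(seq_emb, cutoff, beta=0.0):
    """Keep the low-frequency part and rescale the high-frequency part by beta.

    beta is a fixed scalar here (an assumption of this sketch); beta = 0
    discards the fluctuations entirely, beta = 1 returns the input unchanged.
    """
    low, high = split_frequency_bands(seq_emb, cutoff)
    return low + beta * high


# Toy sequence: a stable level of 2.0 with an alternating +/-1 fluctuation.
toy = np.array([[1.0], [3.0], [1.0], [3.0]])
smoothed = filter_sequence(toy, cutoff=1, beta=0.0)
```

In this toy case, keeping only the lowest frequency bin leaves the per-dimension mean of the sequence, so the alternating fluctuation is removed and the stable level remains. Raising `cutoff` or `beta` retains progressively more of the short-term variation.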
