HI-Series Algorithms A Hybrid of Substance Diffusion Algorithm and Collaborative Filtering

Yu Peng, Ya-Hui An

TL;DR

This work tackles the persistent accuracy-diversity trade-off in recommender systems by introducing HI-series hybrids that nonlinearly fuse item-based collaborative filtering with diffusion-based recommendation methods. The framework extends to HI-MD, HI-HHP, HI-BHC, and HI-BD, using a mixing parameter ε to balance diversity and accuracy, and demonstrates superior performance on MovieLens, Netflix, and RYM compared with baseline methods. Empirical results show that HI-MD particularly boosts performance in sparse data, while HI-BD shines in dense data, with overall improvements in F1-Score and Diversity metrics and enhanced novelty. The findings validate nonlinear hybridization as a robust, adaptable strategy for optimizing recommendation quality across varying data sparsity levels.

Abstract

Recommendation systems face the challenge of balancing accuracy and diversity, as traditional collaborative filtering (CF) and network-based diffusion algorithms exhibit complementary limitations. While item-based CF (ItemCF) enhances diversity through item similarity, it compromises accuracy. Conversely, mass diffusion (MD) algorithms prioritize accuracy by favoring popular items but lack diversity. To address this trade-off, we propose the HI-series algorithms, hybrid models integrating ItemCF with diffusion-based approaches (MD, HHP, BHC, BD) through a nonlinear combination controlled by parameter $\epsilon$. This hybridization leverages ItemCF's diversity and MD's accuracy, extending to advanced diffusion models (HI-HHP, HI-BHC, HI-BD) for enhanced performance. Experiments on MovieLens, Netflix, and RYM datasets demonstrate that HI-series algorithms significantly outperform their base counterparts. In sparse data ($20\%$ training), HI-MD achieves a $0.8\%$ to $4.4\%$ improvement in F1-score over MD while maintaining higher diversity (Diversity@20: 459 vs. 396 on MovieLens). For dense data ($80\%$ training), HI-BD improves F1-score by $2.3\%$ to $5.2\%$ compared to BD, with diversity gains up to $18.6\%$. Notably, hybrid models consistently enhance novelty in sparse settings and exhibit robust parameter adaptability. The results validate that strategic hybridization effectively breaks the accuracy-diversity trade-off, offering a flexible framework for optimizing recommendation systems across data sparsity levels.
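
To make the idea of blending ItemCF with mass diffusion concrete, the following is a minimal sketch and not the authors' implementation. The mass diffusion step uses the standard two-step resource spreading on a user-item bipartite network, and the ItemCF step uses cosine similarity between item columns. The abstract does not state the exact form of the nonlinear combination, so the geometric interpolation `md ** (1 - eps) * cf ** eps` is purely an assumption chosen for illustration, as are all function names and the toy matrix.

```python
import numpy as np

def mass_diffusion_scores(A):
    """Standard mass diffusion on a binary user-item matrix A (users x items)."""
    A = np.asarray(A, dtype=float)
    k_user = np.maximum(A.sum(axis=1), 1.0)  # user degrees, guarded against 0
    k_item = np.maximum(A.sum(axis=0), 1.0)  # item degrees, guarded against 0
    # W[i, j] = (1 / k_j) * sum_u A[u, i] * A[u, j] / k_u
    W = (A.T @ (A / k_user[:, None])) / k_item[None, :]
    return A @ W.T  # scores[u, i] = sum_j W[i, j] * A[u, j]

def item_cf_scores(A):
    """Item-based CF with cosine similarity between item columns."""
    A = np.asarray(A, dtype=float)
    k_item = np.maximum(A.sum(axis=0), 1.0)
    S = (A.T @ A) / np.sqrt(np.outer(k_item, k_item))
    np.fill_diagonal(S, 0.0)  # an item should not recommend itself
    return A @ S

def hybrid_scores(A, eps):
    """HYPOTHETICAL nonlinear mix: eps=0 gives pure MD, eps=1 gives pure ItemCF.

    This interpolation is an assumption for illustration only; the paper's
    actual combination rule is not given in the abstract.
    """
    md = mass_diffusion_scores(A)
    cf = item_cf_scores(A)
    # Rescale each user's scores to [0, 1] so the two methods are comparable.
    md = md / np.maximum(md.max(axis=1, keepdims=True), 1e-12)
    cf = cf / np.maximum(cf.max(axis=1, keepdims=True), 1e-12)
    return md ** (1.0 - eps) * cf ** eps

def recommend(A, eps, top_n=20):
    """Return the top_n uncollected item indices for every user."""
    A = np.asarray(A, dtype=float)
    scores = hybrid_scores(A, eps)
    scores[A > 0] = -np.inf  # never recommend already-collected items
    return np.argsort(-scores, axis=1)[:, :top_n]

# Toy data (made up for this sketch): 3 users, 4 items.
A_toy = np.array([[1, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 1, 1]])
print(recommend(A_toy, eps=0.5, top_n=1))
```

Sweeping `eps` between 0 and 1 in a sketch like this is one way to trace how the ranking moves between the diffusion-based and similarity-based extremes, which is the kind of parameter study the figures below report for the real algorithms.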

Paper Structure

This paper contains 9 sections, 6 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Changes of F1-Score@20, Diversity@20 and Novelty@20 with $\lambda$ on MovieLens. (a) F1-Score@20; (b) Diversity@20; (c) Novelty@20.
  • Figure 2: Changes of F1-Score@20, Diversity@20 and Novelty@20 with $\lambda$ on Netflix. (a) F1-Score@20; (b) Diversity@20; (c) Novelty@20.
  • Figure 3: Changes of F1-Score@20, Diversity@20 and Novelty@20 with $\lambda$ on RYM. (a) F1-Score@20; (b) Diversity@20; (c) Novelty@20.
  • Figure 4: Changes of F1-Score@20, Diversity@20 and Novelty@20 with $\epsilon$ on MovieLens. (a) F1-Score@20; (b) Diversity@20; (c) Novelty@20.
  • Figure 5: Changes of F1-Score@20, Diversity@20 and Novelty@20 with $\epsilon$ on Netflix. (a) F1-Score@20; (b) Diversity@20; (c) Novelty@20.
  • ...and 1 more figure