Towards Fair and Rigorous Evaluations: Hyperparameter Optimization for Top-N Recommendation Task with Implicit Feedback
Hui Fang, Xu Feng, Lu Qin, Zhu Sun
TL;DR
This work targets fair and rigorous evaluation of Top-N recommendation with implicit feedback by systematically tuning the hyperparameters of a set of baseline models on three diverse datasets. It compares seven hyperparameter optimization (HPO) strategies (spanning sampling-, greedy-, and model-based families) across six recommendation algorithms to identify which methods yield robust performance under different data characteristics. Key findings show that simple models such as ItemKNN and PureSVD can outperform newer, more complex models when thoroughly tuned (e.g., with Anneal or TPE on small datasets), whereas complex models such as NeuMF and NGCF benefit from efficient search strategies like Hyperband or BOHB. The study advances reproducibility and fair benchmarking in recommender systems and offers practical guidance on selecting HPO approaches for various algorithm/dataset combinations.
Abstract
The widespread use of the internet has produced an overwhelming amount of data, resulting in the problem of information overload. Recommender systems have emerged as a solution to this problem by providing personalized recommendations to users based on their preferences and historical data. However, as recommendation models become increasingly complex, finding the best hyperparameter combination for each model has become a challenge. The high-dimensional hyperparameter search space makes tuning difficult, and failure to disclose hyperparameter settings may impede the reproducibility of research results. In this paper, we investigate the Top-N implicit recommendation problem and focus on tuning the baseline recommendation algorithms commonly used in comparative experiments by means of hyperparameter optimization algorithms. We propose a research methodology that follows the principles of fair comparison, employing seven hyperparameter search algorithms to tune six common recommendation algorithms on three datasets. We identify the most suitable hyperparameter search algorithms for various recommendation algorithms on different types of datasets, as a reference for future studies. This study contributes to algorithmic research in recommender systems based on hyperparameter optimization, providing a fair basis for comparison.
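To make the idea of tuning a baseline under a fixed, disclosed budget concrete, the following is a minimal sketch of a sampling-based search loop. It is an illustration only, not the authors' implementation: random search stands in for the sampling-based family, and the names `random_search`, `toy_objective`, and `space`, as well as the hyperparameters `k` and `shrink`, are hypothetical. The toy objective replaces what would in practice be "train the recommender and return a validation metric".

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials configurations uniformly at random and keep the best one.

    A fixed n_trials and a fixed seed make the search budget explicit and
    the run repeatable, which is the kind of disclosure a fair comparison needs.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter from its candidate list.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    # Hypothetical stand-in for "train a model, return a validation metric".
    # It peaks at k=50, shrink=10 purely for illustration.
    return 1.0 - abs(config["k"] - 50) / 100 - abs(config["shrink"] - 10) / 100


# Hypothetical search space for a neighborhood-based baseline.
space = {"k": [10, 20, 50, 100, 200], "shrink": [0, 10, 100]}

best_config, best_score = random_search(toy_objective, space, n_trials=30)
print(best_config, best_score)
```

Under this sketch, comparing two search strategies fairly would mean giving each the same `space` and the same `n_trials`, and reporting both alongside the results.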
