A Comparative Study of Recommender Systems under Big Data Constraints
Arimondo Scrivano
TL;DR
This paper addresses how to select a recommender system under Big Data constraints by empirically comparing state-of-the-art algorithms (EASE-R, SLIM, SLIM-ElasticNet, FunkSVD, ALS, P3Alpha, RP3Beta) on large-scale datasets using $Precision@K$, $Recall@K$, $NDCG@K$, and $MAP@K$. It finds a clear trade-off: the SLIM variants achieve the highest accuracy and are interpretable but incur high training time and memory use, while EASE-R and RP3Beta balance performance and scalability well enough for real-time industrial deployment. Matrix factorization methods (FunkSVD, ALS) perform robustly but demand more resources, and graph-based methods such as RP3Beta excel in latency-sensitive contexts. The study offers practical guidelines for model selection based on data growth, latency, and infrastructure, and discusses implications for future-proofing, including quantum-ready infrastructures. No single model dominates across all criteria, so the choice of recommender must be aligned with system constraints and deployment goals.
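For readers unfamiliar with the ranking metrics named above, the following is a minimal Python sketch of how they are commonly computed for one user with binary relevance. It is illustrative only and not the paper's evaluation code: the function names are hypothetical, and the normalization used for AP@K (dividing by min(|relevant|, K)) is one of several conventions in use.

```python
import math
from typing import Dict, Hashable, Sequence, Set


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so rank 1 gets 1/log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """AP@K, normalized by min(|relevant|, k) (an assumed convention)."""
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this cut-off
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def map_at_k(rankings: Dict[Hashable, Sequence[Hashable]],
             ground_truth: Dict[Hashable, Set[Hashable]], k: int) -> float:
    """MAP@K: mean of AP@K over all users present in `rankings`."""
    if not rankings:
        return 0.0
    return sum(
        average_precision_at_k(ranked, ground_truth.get(user, set()), k)
        for user, ranked in rankings.items()
    ) / len(rankings)
```

In an offline comparison of this kind, each per-user metric would typically be averaged over all test users, with `ranked` holding a model's top recommendations and `relevant` the user's held-out interactions.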
Abstract
Recommender Systems (RS) have become essential tools in a wide range of digital services, from e-commerce and streaming platforms to news and social media. As the volume of user-item interactions grows exponentially, especially in Big Data environments, selecting the most appropriate RS model becomes a critical task. This paper presents a comparative study of several state-of-the-art recommender algorithms, including EASE-R, SLIM, SLIM with ElasticNet regularization, Matrix Factorization (FunkSVD and ALS), P3Alpha, and RP3Beta. We evaluate these models according to key criteria such as scalability, computational complexity, predictive accuracy, and interpretability. The analysis considers both their theoretical underpinnings and practical applicability in large-scale scenarios. Our results highlight that while models like SLIM and SLIM-ElasticNet offer high accuracy and interpretability, they suffer from high computational costs, making them less suitable for real-time applications. In contrast, algorithms such as EASE-R and RP3Beta achieve a favorable balance between performance and scalability, proving more effective in large-scale environments. This study aims to provide guidelines for selecting the most appropriate recommender approach based on specific Big Data constraints and system requirements.
