Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy
Tingjia Shen, Hao Wang, Chuhan Wu, Jin Yao Chin, Wei Guo, Yong Liu, Huifeng Guo, Defu Lian, Ruiming Tang, Enhong Chen
TL;DR
This work tackles two core challenges in Sequential Recommendation (SR): the mismatch between loss-based scaling laws and actual SR performance, and the detrimental impact of data redundancy on SR outcomes. By formulating a Performance Law that fits SR performance metrics such as HR@10 and NDCG@10, and by introducing Approximate Entropy (ApEn) as a data-quality measure, the authors enable accurate performance predictions across model sizes and dataset scales. They validate the approach with transformer-based SR models, showing strong correlations between the data proxy $D' = \#\text{Tokens} \cdot \text{ApEn}'$ and performance across diverse datasets, and demonstrate practical applications in global/local parameter optimization and cross-framework scaling. The findings offer a principled way to balance data quantity and quality to achieve near-optimal SR performance under real-world resource constraints.
Abstract
Scaling Laws have emerged as a powerful framework for understanding how model performance evolves as models increase in size, providing valuable insights for optimizing computational resources. In the realm of Sequential Recommendation (SR), which is pivotal for predicting users' sequential preferences, these laws offer a lens through which to address the challenges posed by the scalability of SR models. However, structural and collaborative issues in recommender systems prevent the direct application of the Scaling Law (SL) to these systems. In response, we introduce the Performance Law for SR models, which aims to theoretically investigate and model the relationship between model performance and data quality. Specifically, we first fit the HR and NDCG metrics to transformer-based SR models. Subsequently, we propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach than traditional data quantity metrics. Our method enables accurate predictions across various dataset scales and model sizes, demonstrating a strong correlation in large SR models and offering insights into achieving optimal performance for any given model configuration.
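To make the data-quality idea concrete, the following is a minimal sketch of the textbook Approximate Entropy statistic (Pincus's definition) applied to a sequence of item IDs. It is an illustration under stated assumptions, not the authors' implementation: the function name `approximate_entropy`, the window length `m=2`, and the exact-match tolerance `r=0` are hypothetical choices, and the ApEn' variant used in the data proxy $D'$ may be defined differently in the paper.

```python
import numpy as np


def approximate_entropy(seq, m=2, r=0.0):
    """Textbook ApEn of a 1-D sequence (illustrative sketch only).

    Assumptions not taken from the source: item IDs are compared as
    numbers, and r=0 means two windows match only if they are identical.
    """
    u = np.asarray(seq, dtype=float)
    if len(u) < m + 2:
        raise ValueError("sequence too short for window length m")

    def phi(k):
        # All overlapping windows of length k, shape (len(u) - k + 1, k).
        windows = np.lib.stride_tricks.sliding_window_view(u, k)
        # Chebyshev distance between every pair of windows.
        dist = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=2)
        # Fraction of windows within tolerance r; each window matches
        # itself, so the fraction is never zero and the log is finite.
        match_fraction = np.mean(dist <= r, axis=1)
        return np.mean(np.log(match_fraction))

    # ApEn is the drop in average log match rate when windows grow by one.
    return phi(m) - phi(m + 1)


# A redundant interaction history versus a varied one (toy data).
repetitive = [1, 2] * 50
rng = np.random.default_rng(0)
varied = rng.integers(0, 4, size=200)
print(approximate_entropy(repetitive), approximate_entropy(varied))
```

Under this definition a highly repetitive sequence scores close to zero while a less predictable one scores higher, which is the sense in which such a measure can discount redundant tokens that raw token counts treat as equally informative. The pairwise distance matrix makes this sketch quadratic in sequence length, so it is suited to short sequences only.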
