Inference Computation Scaling for Feature Augmentation in Recommendation Systems
Weihao Liu, Zhaocheng Du, Haiyuan Zhao, Wenbo Zhang, Xiaoyan Zhao, Gang Wang, Zhenhua Dong, Jun Xu
TL;DR
The paper addresses incomplete feature coverage and shallow descriptions in LLM-based feature augmentation for recommendations by applying inference scaling with extended Chain-of-Thought (long-CoT) reasoning. It treats feature generation as a scalable inference task with a policy model, a reward model, and search strategies (e.g., Best-of-N) to produce richer, more diverse features, achieving a 12% improvement in NDCG@10 on benchmark datasets. Key contributions include (1) demonstrating inference scaling for recommendation feature augmentation, (2) linking gains to increased feature quantity and specificity, (3) analyzing how policy-model choice and search strategy affect outcomes, and (4) showing transfer of long-CoT benefits from math and coding to personalized recommendation. The findings suggest that longer reasoning and carefully chosen search procedures can significantly improve personalization by capturing nuanced user preferences, albeit with higher computational costs and with limitations that warrant further theoretical and industrial-scale study.
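The Best-of-N search mentioned above can be illustrated with a minimal sketch: sample N candidate feature sets from a policy model, score each with a reward model, and keep the highest-scoring one. Everything concrete here is an assumption made for illustration, not the paper's implementation: `best_of_n`, `make_toy_policy`, `toy_reward`, and the `POOL` of feature strings are hypothetical names, the policy is a seeded random sampler standing in for an LLM, and the reward simply counts distinct features.

```python
import random
from typing import Callable, List

Features = List[str]

def best_of_n(prompt: str,
              policy: Callable[[str], Features],
              reward: Callable[[str, Features], float],
              n: int) -> Features:
    """Sample n candidates from the policy and return the one the reward ranks highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [policy(prompt) for _ in range(n)]
    return max(candidates, key=lambda feats: reward(prompt, feats))

# Hypothetical feature pool; a real policy model would generate free text.
POOL = [
    "likes science fiction",
    "prefers films from the 1990s",
    "avoids horror",
    "watches mostly on weekends",
    "rates long films highly",
]

def make_toy_policy(seed: int) -> Callable[[str], Features]:
    """Stand-in for an LLM policy: returns a random subset of POOL (assumption)."""
    rng = random.Random(seed)
    def policy(prompt: str) -> Features:
        k = rng.randint(1, len(POOL))
        return rng.sample(POOL, k)
    return policy

def toy_reward(prompt: str, feats: Features) -> float:
    """Stand-in for a reward model: more distinct features scores higher (assumption)."""
    return float(len(set(feats)))

if __name__ == "__main__":
    chosen = best_of_n("user interaction history", make_toy_policy(0), toy_reward, n=8)
    print(chosen)
```

Raising `n` spends more inference compute per user in exchange for a better chance that at least one candidate scores well, which is the trade-off the summary describes as higher computational cost.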
Abstract
Large language models have become a powerful method for feature augmentation in recommendation systems. However, existing approaches relying on quick inference often suffer from incomplete feature coverage and insufficient specificity in feature descriptions, limiting their ability to capture fine-grained user preferences and undermining overall performance. Motivated by the recent success of inference scaling in math and coding tasks, we explore whether scaling inference can address these limitations and enhance feature quality. Our experiments show that scaling inference leads to significant improvements in recommendation performance, with a 12% increase in NDCG@10. The gains can be attributed to two key factors: feature quantity and specificity. In particular, models using extended Chain-of-Thought (CoT) reasoning generate a greater number of detailed and precise features, offering deeper insights into user preferences and overcoming the limitations of quick inference. We further investigate the factors influencing feature quantity, revealing that model choice and search strategy play critical roles in generating a richer and more diverse feature set. This is the first work to apply inference scaling to feature augmentation in recommendation systems, bridging advances in reasoning tasks to enhance personalized recommendation.
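For readers unfamiliar with the reported metric, the sketch below computes NDCG@10 for a single ranked list. It assumes the common linear-gain form, DCG@k = sum over positions i = 1..k of rel_i / log2(i + 1), normalized by the DCG of the ideally ordered list; the abstract does not say which gain variant the authors use, so this choice and the function names `dcg_at_k` and `ndcg_at_k` are assumptions.

```python
import math
from typing import Sequence

def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain, assumed)."""
    # enumerate is 0-based, so position i + 1 gets the discount log2(i + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG@k divided by the DCG@k of the same items sorted by relevance."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items: define the score as 0
    return dcg_at_k(relevances, k) / ideal

if __name__ == "__main__":
    # Relevance of the items in ranked order; 1 marks the item the user interacted with.
    print(ndcg_at_k([0, 1, 0, 0, 0, 0, 0, 0, 0, 0]))
```

In practice the metric is averaged over all test users, and a reported improvement compares that average between the augmented and baseline recommenders.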
