Large Language Model Simulator for Cold-Start Recommendation

Feiran Huang, Yuanchen Bei, Zhenghang Yang, Junyi Jiang, Hao Chen, Qijie Shen, Senzhang Wang, Fakhri Karray, Philip S. Yu

TL;DR

This work tackles the cold-start problem in billion-scale recommender systems by introducing ColdLLM, an LLM-based simulator that generates plausible user interactions for cold items. A coupled funnel framework efficiently narrows the candidate user set to hundreds, enabling scalable behavior simulation and embedding optimization for cold items. The authors jointly train a specialized LLM with a coupled filter to produce high-quality simulated data, demonstrating substantial offline gains in cold-start metrics and a two-week online GMV uplift in a real deployment. These results suggest that leveraging world knowledge in LLMs, combined with efficient filtering and refinement, can significantly mitigate cold-start challenges in large-scale recommender systems with real-world impact.

Abstract

Recommending cold items remains a significant challenge in billion-scale online recommendation systems. While warm items benefit from historical user behaviors, cold items rely solely on content features, which limits their recommendation performance and hurts user experience and revenue. Current models generate synthetic behavioral embeddings from content features but fail to address the core issue: the absence of historical behavior data. To tackle this, we introduce the LLM Simulator framework, which leverages large language models to simulate user interactions for cold items, fundamentally addressing the cold-start problem. However, simply using an LLM to traverse all users can introduce significant complexity in billion-scale systems. To manage this computational complexity, we propose ColdLLM, a coupled-funnel framework for online recommendation. ColdLLM reduces the number of candidate users from billions to hundreds using a trained coupled filter, allowing the LLM to operate efficiently and effectively on the filtered set. Extensive experiments show that ColdLLM significantly surpasses baselines in cold-start recommendation on the Recall and NDCG metrics. A two-week A/B test also validates that ColdLLM can effectively increase GMV during the cold-start period.
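The funnel idea in the abstract (a cheap filter first, then an expensive simulator on the survivors) can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the dot-product scoring, the function names `filter_candidates` and `simulate_interactions`, and the `toy_judge` stand-in for the LLM are all hypothetical, and the user and item vectors are random toy data.

```python
import heapq
import random


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def filter_candidates(item_vec, user_vecs, k):
    """Stage 1 (filter): keep the k user ids that score highest against the item.

    Assumption: similarity is a dot product over precomputed embeddings.
    """
    return heapq.nlargest(k, user_vecs, key=lambda uid: dot(user_vecs[uid], item_vec))


def simulate_interactions(item_vec, user_vecs, candidates, judge):
    """Stage 2 (refine): query the expensive judge only for the filtered candidates."""
    return [uid for uid in candidates if judge(user_vecs[uid], item_vec)]


def toy_judge(user_vec, item_vec):
    # Hypothetical stand-in for an LLM call: accept when similarity is positive.
    return dot(user_vec, item_vec) > 0.0


# Toy data: 10,000 users instead of billions, 8-dimensional random embeddings.
rng = random.Random(0)
user_vecs = {uid: [rng.uniform(-1, 1) for _ in range(8)] for uid in range(10_000)}
item_vec = [rng.uniform(-1, 1) for _ in range(8)]

candidates = filter_candidates(item_vec, user_vecs, k=100)
simulated_users = simulate_interactions(item_vec, user_vecs, candidates, toy_judge)
print(len(candidates), len(simulated_users))
```

The point of the split is cost: the judge is called at most `k` times per cold item regardless of how many users exist, while the filter stage only needs a cheap score per user.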

Paper Structure

This paper contains 40 sections, 15 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: A comparison between traditional item cold-start models and our ColdLLM.
  • Figure 2: The overall model architecture of the proposed ColdLLM.
  • Figure 3: The system architecture for ColdLLM deployment.
  • Figure 4: Ablation study results on CiteULike.
  • Figure 5: Parameter study results on CiteULike.