FilterLLM: Text-To-Distribution LLM for Billion-Scale Cold-Start Recommendation
Ruochen Liu, Hao Chen, Yuanchen Bei, Zheyu Zhou, Lijia Chen, Qijie Shen, Feiran Huang, Fakhri Karray, Senzhang Wang
TL;DR
This work tackles the core bottleneck of LLM-based cold-start recommendation at billion scale by replacing the iterative Text-to-Judgment paradigm with a Text-to-Distribution approach that needs only a single forward pass per item. FilterLLM expands the LLM vocabulary to model billions of users, encodes item content into a neural representation, and predicts the full distribution p(u|c) over users u given item content c in one pass; cold-item embeddings are then updated from interactions sampled from that distribution. The model is trained with a combination of collaborative-driven vocabulary initialization, distribution learning, and behavior-guiding losses, while keeping the backbone LLM fixed and fine-tuning only lightweight adapters. Deployed on Alibaba's platform, FilterLLM achieves more than 30 times higher efficiency together with notable cold-start improvements in offline benchmarks and online A/B tests, indicating substantial practical impact for large-scale recommender systems.
Abstract
Large Language Model (LLM)-based cold-start recommendation systems continue to face significant computational challenges in billion-scale scenarios, as they follow a "Text-to-Judgment" paradigm. This approach processes user-item content pairs as input and evaluates each pair iteratively. To maintain efficiency, existing methods rely on pre-filtering a small candidate pool of user-item pairs. However, this severely limits the inferential capabilities of LLMs by reducing their scope to only a few hundred pre-filtered candidates. To overcome this limitation, we propose a novel "Text-to-Distribution" paradigm, which predicts an item's interaction probability distribution for the entire user set in a single inference. Specifically, we present FilterLLM, a framework that extends the next-word prediction capabilities of LLMs to billion-scale filtering tasks. FilterLLM first introduces a tailored distribution prediction and cold-start framework. Next, FilterLLM incorporates an efficient user-vocabulary structure to train and store the embeddings of billion-scale users. Finally, we detail the training objectives for both distribution prediction and user-vocabulary construction. The proposed framework has been deployed on the Alibaba platform, where it has been serving cold-start recommendations for two months, processing over one billion cold items. Extensive experiments demonstrate that FilterLLM significantly outperforms state-of-the-art methods in cold-start recommendation tasks, achieving over 30 times higher efficiency. Furthermore, an online A/B test validates its effectiveness in billion-scale recommendation systems.
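To make the "Text-to-Distribution" idea concrete, the following toy sketch mirrors next-word prediction with user tokens in place of words: one encoding of the item text yields a softmax distribution over every user, from which interactions are sampled. It is an illustration under stated assumptions, not the authors' implementation: `encode_item` is a hypothetical stand-in for the LLM, the user vocabulary is random, and averaging the sampled users' embeddings is an assumed update rule for the cold-item embedding.

```python
import math
import random


def encode_item(text, dim=8):
    """Hypothetical stand-in for the LLM: map item text to a fixed-size vector."""
    rng = random.Random(sum(ord(ch) for ch in text))  # deterministic seed from the text
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def user_distribution(item_vec, user_vocab):
    """Softmax over dot products between the item vector and each user-token embedding."""
    logits = [sum(a * b for a, b in zip(item_vec, emb)) for emb in user_vocab]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_users(probs, k, seed=0):
    """Draw k user ids (with replacement) from the predicted distribution."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=k)


def cold_item_embedding(sampled_users, user_embeddings):
    """Assumed update rule: average the embeddings of the sampled users."""
    dim = len(user_embeddings[0])
    n = len(sampled_users)
    return [sum(user_embeddings[u][d] for u in sampled_users) / n for d in range(dim)]


# Toy setup: 1,000 users instead of billions, with random embeddings (assumption).
NUM_USERS, DIM = 1000, 8
vocab_rng = random.Random(42)
user_vocab = [[vocab_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_USERS)]

item_vec = encode_item("wireless noise-cancelling headphones", DIM)
probs = user_distribution(item_vec, user_vocab)   # one pass covers all users
users = sample_users(probs, k=20)                 # simulated interactions
item_emb = cold_item_embedding(users, user_vocab)
```

A Text-to-Judgment system would instead run the model once per user-item pair, so scoring the same item against all users would take `NUM_USERS` model calls rather than one; this is why such systems pre-filter to a few hundred candidates.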
