ID-centric Pre-training for Recommendation
Yiqing Wu, Ruobing Xie, Zhao Zhang, Fuzhen Zhuang, Xu Zhang, Leyu Lin, Zhanhui Kang, Yongjun Xu
TL;DR
The paper tackles cross-domain transferability in sequential recommendation by proposing ID-centric pre-training (IDP), which transfers pre-trained item ID embeddings across domains via a Cross-domain ID Matcher (CDIM) that leverages both textual and behavioral cues. IDP keeps ID-based representations for downstream recommendation and uses the textual modality as a bridge that connects pre-training IDs to new items, enabling effective initialization and optional fine-tuning in target domains. Empirical results on nine Amazon datasets show consistent gains over strong baselines, universality across base models, and improvements even in the zero-shot setting; the framework can also extend to multi-modal information. The approach offers a practical, scalable alternative to heavy PLM-based methods and points toward universal cross-domain ID transfer in industry-scale recommender systems.
Abstract
Classical sequential recommendation models generally adopt ID embeddings to represent items and to store knowledge learned from users' historical behaviors. However, these unique IDs are difficult to transfer to new domains. With the rise of pre-trained language models (PLMs), some pioneering works adopt PLMs for pre-trained recommendation, treating modality information (e.g., text) as universal across domains. Unfortunately, the behavioral information in ID embeddings has been shown to still dominate modality information in PLM-based recommendation models, which limits these models' performance. In this work, we propose a novel ID-centric recommendation pre-training paradigm (IDP), which directly transfers informative ID embeddings learned in pre-training domains to item representations in new domains. Specifically, in the pre-training stage, besides the ID-based sequential model for recommendation, we also build a Cross-domain ID Matcher (CDIM) learned from both behavioral and modality information. In the tuning stage, the modality information of new-domain items serves as a cross-domain bridge built by CDIM. We first use the textual information of downstream-domain items to retrieve behaviorally and semantically similar items from the pre-training domains using CDIM. Next, the retrieved pre-trained ID embeddings, rather than textual embeddings, are directly adopted to generate the embeddings of new downstream items. Through extensive experiments on real-world datasets, in both cold and warm settings, we demonstrate that our proposed model significantly outperforms all baselines. Code will be released upon acceptance.
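The tuning-stage idea described above (retrieve similar pre-training items from text, then build a new item's embedding from their pre-trained ID embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `init_target_embeddings`, the cosine-similarity retrieval, the softmax-weighted top-k average, and the parameters `k` and `temperature` are all assumptions made for illustration, and random vectors stand in for the CDIM text representations, which in the paper are learned from both behavioral and modality information.

```python
import numpy as np


def init_target_embeddings(target_text, source_text, source_id_emb,
                           k=3, temperature=0.1):
    """Hypothetical sketch: initialise new-domain item embeddings from the
    ID embeddings of the k most similar pre-training items.

    target_text:   (n_target, d_text) text vectors of new-domain items
    source_text:   (n_source, d_text) text vectors of pre-training items
    source_id_emb: (n_source, d_id)   pre-trained ID embeddings
    """
    # Cosine similarity between every target item and every source item.
    t = target_text / np.linalg.norm(target_text, axis=1, keepdims=True)
    s = source_text / np.linalg.norm(source_text, axis=1, keepdims=True)
    sim = t @ s.T                                   # (n_target, n_source)

    # Indices and scores of the k most similar source items per target item.
    topk = np.argsort(-sim, axis=1)[:, :k]          # (n_target, k)
    topk_sim = np.take_along_axis(sim, topk, axis=1)

    # Softmax over the k scores (assumed aggregation rule, not from the paper).
    logits = topk_sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted sum of the retrieved ID embeddings: (n_target, d_id).
    return (source_id_emb[topk] * weights[..., None]).sum(axis=1)


# Placeholder data: random vectors stand in for real text and ID embeddings.
rng = np.random.default_rng(0)
source_text = rng.normal(size=(100, 16))
source_id_emb = rng.normal(size=(100, 8))
target_text = rng.normal(size=(5, 16))

target_init = init_target_embeddings(target_text, source_text, source_id_emb)
```

Under these assumptions, `target_init` has one row per new-domain item and lives in the same space as the pre-trained ID embeddings, so it could be used directly (zero-shot) or as a starting point for fine-tuning in the target domain.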
