ID-centric Pre-training for Recommendation

Yiqing Wu, Ruobing Xie, Zhao Zhang, Fuzhen Zhuang, Xu Zhang, Leyu Lin, Zhanhui Kang, Yongjun Xu

TL;DR

The paper tackles cross-domain transferability in sequential recommendation by proposing ID-centric pre-training (IDP), which transfers pre-trained item ID embeddings across domains via a Cross-domain ID Matcher (CDIM) that leverages both textual and behavioral cues. IDP preserves ID-based representations for downstream recommendation, while using textual modality as a bridge to connect pre-training IDs to new items, enabling effective initialization and optional fine-tuning in target domains. Empirical results on nine Amazon datasets demonstrate consistent gains over strong baselines, with strong universality across base models and even zero-shot improvements, and show the framework can extend to multi-modal information. The approach offers a practical, scalable alternative to heavy PLM-based methods and highlights a path toward universal cross-domain ID transfer in industry-scale recommender systems.

Abstract

Classical sequential recommendation models generally adopt ID embeddings to store knowledge learned from users' historical behaviors and to represent items. However, these unique IDs are difficult to transfer to new domains. With the rise of pre-trained language models (PLMs), some pioneering works adopt PLMs for pre-trained recommendation, where modality information (e.g., text) is considered universal across domains via the PLM. Unfortunately, the behavioral information in ID embeddings has still been shown to dominate modality information in PLM-based recommendation models, which limits these models' performance. In this work, we propose a novel ID-centric recommendation pre-training paradigm (IDP), which directly transfers informative ID embeddings learned in pre-training domains to item representations in new domains. Specifically, in the pre-training stage, besides the ID-based sequential model for recommendation, we also build a Cross-domain ID Matcher (CDIM) learned from both behavioral and modality information. In the tuning stage, the modality information of new-domain items serves as a cross-domain bridge built by CDIM. We first leverage the textual information of downstream-domain items to retrieve behaviorally and semantically similar items from the pre-training domains using CDIM. Next, these retrieved pre-trained ID embeddings, rather than textual embeddings, are directly adopted to generate the embeddings of new downstream items. Through extensive experiments on real-world datasets, in both cold and warm settings, we demonstrate that our proposed model significantly outperforms all baselines. Code will be released upon acceptance.
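
The tuning-stage idea in the abstract (retrieve similar pre-training items, then build a new item's embedding from their ID embeddings) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `init_new_item_embedding`, the use of cosine similarity, the top-k cutoff `k`, and the softmax weighting with `temperature`. The sketch also assumes that CDIM has already mapped the new item's text and the pre-training items into a shared matching space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def init_new_item_embedding(target_vec, source_match_vecs, source_id_embs,
                            k=2, temperature=1.0):
    """Build an embedding for a new downstream item from pre-trained ID embeddings.

    target_vec:        matching-space vector of the new item's text (hypothetical CDIM output)
    source_match_vecs: matching-space vectors of the pre-training items
    source_id_embs:    pre-trained ID embeddings, aligned by index with source_match_vecs
    k, temperature:    illustrative hyperparameters, not taken from the paper
    """
    # Score every pre-training item against the new item.
    sims = [cosine(target_vec, s) for s in source_match_vecs]

    # Keep the indices of the k most similar pre-training items.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]

    # Softmax over the retrieved similarities (shifted by the max for stability).
    max_sim = max(sims[i] for i in top)
    exp_scores = [math.exp((sims[i] - max_sim) / temperature) for i in top]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]

    # Weighted average of the retrieved ID embeddings, not of any text embedding.
    dim = len(source_id_embs[0])
    return [sum(w * source_id_embs[i][d] for w, i in zip(weights, top))
            for d in range(dim)]
```

In this sketch the text vector is used only to decide which pre-training items to retrieve. The returned vector is composed entirely of pre-trained ID embeddings and could serve as the initialization of the new item's embedding before any optional fine-tuning in the target domain.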

Paper Structure

This paper contains 36 sections, 19 equations, 4 figures, 10 tables, 1 algorithm.

Figures (4)

  • Figure 1: An example of ID-centric pre-training recommendation. The central idea is to select related pre-training ID embeddings to generate item embeddings in new domains.
  • Figure 2: Illustrations of our IDP and other pre-trained models. Different from conventional pre-trained models that use texts (representations) as items for behavior modeling, our IDP directly uses pre-trained ID embeddings to generate new items.
  • Figure 3: Overall architecture of our IDP.
  • Figure 5: Ablation studies on Arts and Office datasets. All components of IDP are effective.