NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation
Jiaqi Zhang, Yu Cheng, Yongxin Ni, Yunzhu Pan, Zheng Yuan, Junchen Fu, Youhua Li, Jie Wang, Fajie Yuan
TL;DR
NineRec introduces a large-scale TransRec benchmark suite to advance transferable recommendation by learning directly from raw multimodal item content. The dataset combines a substantial source domain (Bili_2M) with nine diverse target domains, each providing textual descriptions and cover images, enabling end-to-end multimodal learning. Extensive experiments compare MoRec and TransRec baselines, highlighting that end-to-end training with modality encoders generally outperforms two-stage approaches and that cross-domain pre-training enhances downstream transfer, albeit with substantial computational costs. NineRec thereby offers a public platform for evaluating transferability, benchmarking architectures, and fostering cross-pollination between recommender systems, NLP, and computer vision. The work also addresses privacy and copyright considerations and provides a pathway for future improvements through larger pre-training scales and optimized user encoder (UE) and modality encoder (ME) designs.
Abstract
Large foundational models, through upstream pre-training and downstream fine-tuning, have achieved immense success in the broad AI community due to improved model performance and significant reductions in repetitive engineering. By contrast, the transferable one-for-all models in the recommender system field, referred to as TransRec, have made limited progress. The development of TransRec has encountered multiple challenges, among which the lack of large-scale, high-quality transfer learning recommendation datasets and benchmark suites is one of the biggest obstacles. To this end, we introduce NineRec, a TransRec dataset suite that comprises a large-scale source domain recommendation dataset and nine diverse target domain recommendation datasets. Each item in NineRec is accompanied by a descriptive text and a high-resolution cover image. Leveraging NineRec, we enable the implementation of TransRec models by learning from raw multimodal features instead of relying solely on pre-extracted off-the-shelf features. Finally, we present robust TransRec benchmark results with several classical network architectures, providing valuable insights into the field. To facilitate further research, we will release our code, datasets, benchmarks, and leaderboards at https://github.com/westlake-repl/NineRec.
