TransRec: Learning Transferable Recommendation from Mixture-of-Modality Feedback

Jie Wang, Fajie Yuan, Mingyue Cheng, Joemon M. Jose, Chenyun Yu, Beibei Kong, Zhijin Wang, Bo Hu, Zang Li

TL;DR

TransRec introduces a modality-based, end-to-end recommender that learns from mixture-of-modality feedback to achieve cross-domain transfer without overlapping users/items. By replacing item IDs with modality encoders (text via BERT, image via ResNet) and using a Transformer-based two-tower DSSM, it enables large-scale pre-training on MoM data and fine-tuning on diverse target domains. Empirical results show strong transferability across four downstream settings and highlight the benefits of end-to-end training and data scaling, despite high computational cost. Overall, the approach points toward universal recommender systems by leveraging multimodal content akin to foundation models in other domains.

Abstract

Learning large-scale pre-trained models on broad-ranging data and then transferring them to a wide range of target tasks has become the de facto paradigm in many machine learning (ML) communities. Such big models are not only strong performers in practice but also offer a promising way to break out of task-specific modeling restrictions, thereby enabling task-agnostic and unified ML systems. However, this popular paradigm remains largely unexplored by the recommender systems (RS) community. A critical issue is that standard recommendation models are primarily built on categorical identity features. That is, users and the items they interact with are represented by their unique IDs, which are generally not shareable across different systems or platforms. To pursue transferable recommendation, we propose studying pre-trained RS models in a novel scenario where a user's interaction feedback involves mixture-of-modality (MoM) items, e.g., text and images. We then present TransRec, a very simple modification of the popular ID-based RS framework. TransRec learns directly from the raw features of the MoM items in an end-to-end training manner and thus enables effective transfer learning under various scenarios without relying on overlapping users or items. We empirically study the transfer ability of TransRec across four different real-world recommendation settings. In addition, we examine the effect of scaling the source and target data sizes. Our results suggest that learning neural recommendation models from MoM feedback provides a promising way to realize universal RS.

Paper Structure

This paper contains 13 sections, 7 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Schematic of learning from MoM. TransRec first pre-trains a unified recommendation model with MoM feedback in the source domain and then serves any target domain as long as the item's modality type is contained in the MoM feedback.
  • Figure 2: Illustration of the training process of TransRec. Here, the inner product is employed to compute the preference between users and candidate items.
  • Figure 3: Convergence trend by scaling the source data.
  • Figure 4: Comparison of convergence by scaling TN-mixed dataset.
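
The Figure 2 caption says the inner product scores a user against a candidate item, and the TL;DR describes a two-tower design in which item IDs are replaced by modality encoders. The sketch below illustrates that data flow only, under loud assumptions: `encode_text` and `encode_image` are hypothetical toy stand-ins (not BERT or ResNet), `encode_user` uses mean pooling in place of the Transformer user encoder, and `DIM` and all function names are invented for illustration rather than taken from the paper.

```python
from typing import List, Sequence, Tuple, Union

DIM = 4  # toy embedding size (assumption, not from the paper)

Vector = List[float]
Item = Tuple[str, Union[str, Sequence[float]]]  # (modality, raw content)


def _normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def encode_text(text: str) -> Vector:
    """Hypothetical stand-in for a text encoder: bucketed character counts."""
    vec = [0.0] * DIM
    for ch in text.lower():
        vec[ord(ch) % DIM] += 1.0
    return _normalize(vec)


def encode_image(pixels: Sequence[float]) -> Vector:
    """Hypothetical stand-in for an image encoder: bucketed pixel sums."""
    vec = [0.0] * DIM
    for i, p in enumerate(pixels):
        vec[i % DIM] += float(p)
    return _normalize(vec)


def encode_item(item: Item) -> Vector:
    """Item tower: dispatch on modality instead of looking up an item ID."""
    kind, content = item
    if kind == "text":
        return encode_text(content)
    if kind == "image":
        return encode_image(content)
    raise ValueError(f"unsupported modality: {kind}")


def encode_user(history: Sequence[Item]) -> Vector:
    """User tower stand-in: mean-pool the encoded items of a mixed-modality history."""
    if not history:
        raise ValueError("history must contain at least one item")
    vecs = [encode_item(it) for it in history]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def score(user_vec: Vector, item_vec: Vector) -> float:
    """Preference score as the inner product of user and item vectors."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


if __name__ == "__main__":
    history = [("text", "breaking news"), ("image", [0.1, 0.9, 0.3, 0.5])]
    candidate = ("text", "sports headline")
    print(score(encode_user(history), encode_item(candidate)))
```

Because both towers consume raw content rather than IDs, the same functions can score items from a dataset that shares no users or items with the one the history came from, which is the property the abstract relies on for transfer. In this toy version nothing is learned; a real system would train the encoders end to end.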