PPM: A Pre-trained Plug-in Model for Click-through Rate Prediction

Yuanbo Gao; Peng Lin; Dongyue Wang; Feng Mei; Xiwei Zhao; Sulong Xu; Jinghe Hu

TL;DR

PPM is a pre-trained plug-in CTR model that takes multi-modal features as input, is pre-trained on large-scale data, and is plugged into an IDRec model to improve the unified model's performance and iteration efficiency.

Abstract

Click-through rate (CTR) prediction is a core task in recommender systems. Existing methods (IDRec for short), which rely on unique identities to represent distinct users and items, have prevailed for decades. On one hand, IDRec often suffers significant performance degradation on the cold-start problem; on the other hand, IDRec cannot use longer periods of training data because of constraints imposed by iteration efficiency. Most prior studies alleviate these problems by introducing pre-trained knowledge (e.g., a pre-trained user model or multi-modal embeddings). However, the large number of parameters in a pre-trained model causes online latency to grow sharply. Therefore, most of these approaches cannot train the pre-trained model end-to-end with IDRec as a unified model in industrial recommender systems, which limits the potential of the pre-trained model. To this end, we propose a $\textbf{P}$re-trained $\textbf{P}$lug-in CTR $\textbf{M}$odel, namely PPM. PPM takes multi-modal features as input and is pre-trained on large-scale data. PPM is then plugged into the IDRec model to improve the unified model's performance and iteration efficiency. Once PPM is incorporated into the IDRec model, certain intermediate results within the network are cached, and only a subset of the parameters participates in training and serving. Hence, our approach can deploy an end-to-end model without a large increase in latency. Comprehensive offline experiments and online A/B testing at JD E-commerce demonstrate the efficiency and effectiveness of PPM.
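
The caching idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `FrozenEncoder`, `CachedPlugin`, `predict_ctr`, and `sgd_step` are hypothetical names, the sine-based embedding is a placeholder for an expensive pre-trained module, and a small logistic head stands in for "the subset of parameters" that participates in training and serving.

```python
import math
from typing import Dict, List, Tuple


class FrozenEncoder:
    """Hypothetical stand-in for an expensive pre-trained module (frozen)."""

    def __init__(self) -> None:
        self.calls = 0  # counts how often the expensive path actually runs

    def encode(self, item_id: int) -> List[float]:
        self.calls += 1
        # Deterministic toy embedding; a real encoder would be a large network.
        return [math.sin(item_id + k) for k in range(4)]


class CachedPlugin:
    """Caches the frozen encoder's intermediate results per item."""

    def __init__(self, encoder: FrozenEncoder) -> None:
        self.encoder = encoder
        self.cache: Dict[int, List[float]] = {}

    def lookup(self, item_id: int) -> List[float]:
        if item_id not in self.cache:
            self.cache[item_id] = self.encoder.encode(item_id)
        return self.cache[item_id]


def predict_ctr(weights: List[float], bias: float, features: List[float]) -> float:
    """Small trainable head: logistic regression on the cached features."""
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def sgd_step(weights: List[float], bias: float, features: List[float],
             label: int, lr: float = 0.1) -> Tuple[List[float], float]:
    """One gradient step on log loss; only the head's parameters change."""
    grad = predict_ctr(weights, bias, features) - label
    new_weights = [w - lr * grad * x for w, x in zip(weights, features)]
    return new_weights, bias - lr * grad


encoder = FrozenEncoder()
plugin = CachedPlugin(encoder)
weights, bias = [0.0] * 4, 0.0
for item_id, label in [(1, 1), (2, 0), (1, 1), (2, 0)]:
    weights, bias = sgd_step(weights, bias, plugin.lookup(item_id), label)
```

In this sketch the encoder runs once per distinct item no matter how many training steps or serving requests touch that item, which is the intuition behind keeping latency bounded while still using pre-trained knowledge.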

Paper Structure (25 sections, 10 equations, 3 figures, 3 tables)

This paper contains 25 sections, 10 equations, 3 figures, 3 tables.

Introduction
Related Work
ID-based Recommendation
Pre-trained Modal Embedding
Pre-trained User Model
Methods
Pre-trained CTR Model
Modality Encoder Layer
Behavior-Transformer Layer
CTR Prediction Layer
Unified Ranking Model
ID-based Sequential Module
PPM
Multi-task Module
Experiments
...and 10 more sections

Figures (3)

Figure 1: Architecture of the proposed PPM and Unified Ranking Model (URM).
Figure 2: Performance of the proposed URM and the Base model in cold-start and warm settings. The red shaded region represents the observed increase in $\overline{AUC}$ of URM over the Base model.
Figure 3: The offline training and incremental update processes of PPM and URM.
