Cross Domain LifeLong Sequential Modeling for Online Click-Through Rate Prediction

Ruijie Hou; Zhaoyang Yang; Yu Ming; Hongyu Lu; Zhuobin Zheng; Yu Chen; Qinsong Zeng; Ming Chen

TL;DR

This work tackles cross-domain CTR prediction under lifelong sequential modeling by proposing the Lifelong Cross Network (LCN). LCN combines a Cross Representation Production (CRP) module, which uses contrastive learning to align item embeddings across the source and target domains, with a Lifelong Attention Pyramid (LAP) that progressively extracts interest representations from lifelong sequences through three attention levels: Complete-Scope Attention (CSA), Median-Scope Attention (MSA), and Focused-Scope Attention (FSA). The overall objective adds the CRP loss ${L}_{CRP}$, weighted by $\lambda_{CRP}$, to the standard CTR loss ${L}_{CTR}$, giving $L = {L}_{CTR} + \lambda_{CRP}{L}_{CRP}$, so both modules are optimized end to end. Experiments on Taobao and WeChat Channels, covering offline metrics (AUC, GAUC, logloss) and online metrics (CTR, stay time, latency), show that LCN improves predictive accuracy and online performance, with notable gains in cross-domain live recommendations and robust applicability across backbones.
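The combined objective $L = {L}_{CTR} + \lambda_{CRP}{L}_{CRP}$ can be illustrated with a minimal sketch. It assumes the CTR loss is the usual mean binary cross-entropy (logloss); the function names and the default `lambda_crp=0.1` are hypothetical placeholders, not values taken from the paper.

```python
import math

def ctr_logloss(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between click labels and predicted CTRs.

    Assumption: the "standard CTR loss" is ordinary logloss.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def total_loss(l_ctr, l_crp, lambda_crp=0.1):
    """L = L_CTR + lambda_CRP * L_CRP.

    lambda_crp=0.1 is an illustrative default, not the paper's setting.
    """
    return l_ctr + lambda_crp * l_crp
```

Because the two terms are summed into one scalar, a single backward pass can update the CTR tower and the CRP sub-network together, which is what makes the training end to end.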

Abstract

Deep neural networks (DNNs) that incorporate lifelong sequential modeling (LSM) have brought great success to recommendation systems on various social media platforms. While domain-specific LSM has seen continuous improvement, little work has addressed cross-domain LSM, which models the lifelong sequences of both the target domain and the source domain. In this paper, we propose the Lifelong Cross Network (LCN), which incorporates cross-domain LSM to improve click-through rate (CTR) prediction in the target domain. LCN contains a Lifelong Attention Pyramid (LAP) module comprising three levels of cascaded attention that extract interest representations with respect to the candidate item from lifelong sequences. We also propose a Cross Representation Production (CRP) module that enforces additional supervision on the learning and alignment of cross-domain representations so that they can be better reused when learning CTR prediction in the target domain. We conducted extensive experiments on a WeChat Channels industrial dataset as well as on a benchmark dataset. The results show that LCN outperforms existing work in both prediction accuracy and online performance.
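The idea of three cascaded attention levels that narrow a lifelong sequence toward the candidate item can be sketched as follows. This is only an illustration of the cascade pattern, not the paper's CSA, MSA, and FSA definitions: the scoring functions (dot product, then cosine similarity, then softmax pooling), the cutoffs `k1` and `k2`, and all function names are assumptions made for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return 0.0 if nu == 0.0 or nv == 0.0 else dot(u, v) / (nu * nv)

def top_k(items, query, k, score):
    """Keep the k items that score highest against the query."""
    return sorted(items, key=lambda x: score(x, query), reverse=True)[:k]

def softmax_pool(items, query):
    """Softmax-weighted average of items, weighted by dot product with the query."""
    scores = [dot(x, query) for x in items]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w * x[d] for w, x in zip(weights, items)) / z
            for d in range(len(query))]

def cascaded_interest(sequence, candidate, k1, k2):
    """Hypothetical three-level cascade over item embeddings."""
    level1 = top_k(sequence, candidate, k1, dot)     # cheap pass over the full sequence
    level2 = top_k(level1, candidate, k2, cosine)    # finer pass over k1 survivors
    return softmax_pool(level2, candidate)           # attention pooling over k2 survivors

history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
interest = cascaded_interest(history, [1.0, 0.0], k1=3, k2=2)
```

Each level works on a shorter list than the one before it, so the most expensive computation is applied only to the items most relevant to the candidate.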

Paper Structure (23 sections, 14 equations, 5 figures, 5 tables)

INTRODUCTION
RELATED WORK
PRELIMINARIES
METHODOLOGY
Cross Representation Production
Positive and Negative Sampling
Loss Function
Lifelong Attention Pyramid
The Complete-Scope Attention (CSA)
The Median-Scope Attention (MSA)
The Focused-Scope Attention (FSA)
EXPERIMENTS
Experimental Settings
Datasets
Competitors
...and 8 more sections

Figures (5)

Figure 1: (a) A showcase of video content and live content on the WeChat Channels platform. (b) A comparison of users' behavior sequence lengths for video content and live content on the WeChat Channels platform.
Figure 2: An overview of the proposed Lifelong Cross Network (LCN). There are two major components in the model: (i) The Cross Representation Production (CRP) module is a jointly trained sub-network to learn item embeddings capable of identifying similar items across domains. (ii) The Lifelong Attention Pyramid (LAP) module is composed of three levels of cascading attentions. These attentions are designed to progressively extract interest representations with respect to candidate items from the cross-domain lifelong sequence.
Figure 3: An illustration of the Cross Representation Production (CRP) module. It samples three distinct types of positive pairs and negative pairs. A contrastive loss reduces the cosine distance between items in positive pairs while increasing it between items in negative pairs.
Figure 4: Visualization of the item embeddings from $LCN\ w/o\ CRP$ and $LCN$.
Figure 5: Comparisons of the consistencies between GSU & ESU (a) of different LSM methods and (b) under different settings of $K_1$.
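
The contrastive objective described in the Figure 3 caption, which pulls positive pairs together and pushes negative pairs apart in cosine distance, can be sketched as below. The exact loss used by CRP is not given in this summary, so the margin-based form, the `margin` parameter, and the function names are assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0.0 or nv == 0.0 else d / (nu * nv)

def contrastive_loss(pos_pairs, neg_pairs, margin=0.0):
    """Hypothetical margin-based contrastive loss on cosine similarity.

    Positive pairs are penalized by their cosine distance (1 - cos).
    Negative pairs are penalized only when their similarity exceeds the margin.
    """
    pos = sum(1.0 - cosine(u, v) for u, v in pos_pairs)
    neg = sum(max(0.0, cosine(u, v) - margin) for u, v in neg_pairs)
    return (pos + neg) / (len(pos_pairs) + len(neg_pairs))

# Aligned positives and orthogonal negatives incur no penalty.
good = contrastive_loss([([1.0, 0.0], [1.0, 0.0])], [([1.0, 0.0], [0.0, 1.0])])
# Swapping the roles gives the maximum penalty for each pair at margin 0.
bad = contrastive_loss([([1.0, 0.0], [0.0, 1.0])], [([1.0, 0.0], [1.0, 0.0])])
```

Minimizing this quantity moves embeddings of items in positive pairs toward the same direction, which is the kind of cross-domain alignment the Figure 4 embedding visualization is meant to show.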
