Moment&Cross: Next-Generation Real-Time Cross-Domain CTR Prediction for Live-Streaming Recommendation at Kuaishou

Jiangxia Cao, Shen Wang, Yue Li, Shenghui Wang, Jian Tang, Shiyao Wang, Shuang Yang, Zhaojie Liu, Guorui Zhou

TL;DR

Moment&Cross tackles real-time cross-domain CTR prediction for live-streaming on Kuaishou by addressing temporal content shifts, feedback delays, and data sparsity through a dual mechanism: Moment enables 30-second real-time data streaming with first-only label-mask learning, while Cross transfers rich short-video interests via General and Exact Search Units and contrastive alignment. The framework combines fast real-time feedback with cross-domain history to improve predictions, using multi-task ranking and a tailored loss design ($\mathcal{L}_{fast}$, $\mathcal{L}_{slow}$, $\mathcal{L}_{moment}$) and a composite Ranking_Score, together with cross-domain contrastive objectives. Offline results show consistent gains in AUC/GAUC and ablations confirm the value of each component, and online A/B tests demonstrate improvements in Click and Watch Time, with notable benefits for low-gift users, validating real-time adaptation and cross-domain transfer at scale. The approach is deployed to serve 400 million users, illustrating practical impact in live-streaming recommendations where identifying high-light moments and surfacing related live-streams are critical.

Abstract

Kuaishou is one of the largest short-video and live-streaming platforms. Compared with short-video recommendation, live-streaming recommendation is more complex because: (1) live-streams are only temporarily alive for distribution, (2) users may watch for a long time, which delays feedback, and (3) content is unpredictable and changes over time. Even if a user is interested in the live-streaming author, an exposure may still result in a negative watch (e.g., a short view of < 3s) when the real-time content is not attractive enough. Live-streaming recommendation therefore faces a challenging task: how do we recommend a live-stream to users at the right moment? Additionally, our platform's major exposure content is short-video, and the amount of exposed short-video is 9x more than exposed live-streaming. Users therefore leave more behaviors on short-videos, which leads to a serious data imbalance problem: live-streaming data alone cannot fully reflect user interests. This raises another challenging task: how do we utilize users' short-video behaviors to improve live-streaming recommendation?

Paper Structure (13 sections, 7 equations, 5 figures, 4 tables)

This paper contains 13 sections, 7 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Typical live-streaming pattern relating the 'high-light moment', the real-time audience, and CTR trends.
  • Figure 2: The slide-page RecSys architecture of Kuaishou's short-video and live-streaming services. The services are deployed separately, each with its own data-streaming and model. The only way to access a user's logs from another business is to use the 'interaction logs' storage service to look back over the user's historical short-video behaviors and find a small group of related items.
  • Figure 3: Difference between the training samples produced by the fast-slow 5-min & 1-hour data-streaming and by the real-time 30s data-streaming. We show only the simplest sample format (user, live-streaming, click, long-view, like, comment, gift). In the fast-slow data-streaming, the fast flow reports all user behaviors observed within a 5-min window, and the slow flow reports positive user behaviors that were missed in the 5-min window but observed within 1 hour. In our real-time data-streaming, we report users' first positive behaviors immediately, every 30 seconds, and report all negative behaviors when the user exits a live-stream. As the indicative relationship between the reported samples shows, our real-time data-streaming produces training samples as soon as possible, which encourages the model to capture a live-stream's rising CTR trend.
  • Figure 4: We introduce 5 short-video and live-streaming searched sequences to support our model: (1) we first apply a contrastive mechanism to align their embedding spaces, (2) we then use the target attention mechanism to extract users' interests, and (3) we finally concatenate the cross-domain short-video signals to predict the probability of each behavior.
  • Figure 5: (a) Moment enhances our model's ability to perceive high-light live-streamings; (b) short-video cross-domain interests help our system find related live-streamings.
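
The real-time reporting rule described in the Figure 3 caption can be illustrated with a small sketch. This is not the production pipeline: the class `WatchSession`, its methods `observe`, `flush` and `exit`, and the convention of using `None` for a behavior that a sample does not report are all assumptions made for illustration. Only the sample fields (user, live-streaming, click, long-view, like, comment, gift), the 30-second reporting of first positive behaviors, and the reporting of negatives on exit come from the caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Behavior fields taken from the sample format in the Figure 3 caption.
BEHAVIORS = ("click", "long_view", "like", "comment", "gift")

Sample = Dict[str, object]


def make_sample(user: str, live: str, labels: Dict[str, int]) -> Sample:
    """Build one training sample; behaviors missing from `labels` stay None.

    Using None for "not reported in this sample" is an assumed convention.
    """
    sample: Sample = {"user": user, "live": live}
    for behavior in BEHAVIORS:
        value: Optional[int] = labels.get(behavior)
        sample[behavior] = value
    return sample


@dataclass
class WatchSession:
    """Hypothetical per-(user, live-stream) state for real-time reporting."""

    user: str
    live: str
    reported: set = field(default_factory=set)        # positives already sent
    pending: List[str] = field(default_factory=list)  # first-time positives not yet sent

    def observe(self, behavior: str) -> None:
        """Record a positive behavior; repeats of the same behavior are ignored."""
        if behavior not in BEHAVIORS:
            raise ValueError(f"unknown behavior: {behavior}")
        if behavior not in self.reported and behavior not in self.pending:
            self.pending.append(behavior)

    def flush(self) -> List[Sample]:
        """Intended to run on every 30-second tick: report first positives."""
        samples = [make_sample(self.user, self.live, {b: 1}) for b in self.pending]
        self.reported.update(self.pending)
        self.pending = []
        return samples

    def exit(self) -> List[Sample]:
        """Run when the user leaves: flush, then report unseen behaviors as negatives."""
        samples = self.flush()
        negatives = {b: 0 for b in BEHAVIORS if b not in self.reported}
        if negatives:
            samples.append(make_sample(self.user, self.live, negatives))
        return samples


session = WatchSession("u1", "live42")
session.observe("click")
session.observe("click")      # second click is not a "first" behavior
tick_samples = session.flush()
session.observe("like")
exit_samples = session.exit()
```

In this sketch, the first tick yields a single click-positive sample, and the exit yields a like-positive sample followed by one sample that marks long-view, comment and gift as negative. How the actual system batches behaviors into samples, and how the label mask is applied during training, is not specified in the text above.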