Pacer and Runner: Cooperative Learning Framework between Single- and Cross-Domain Sequential Recommendation

Chung Park, Taesan Kim, Hyungjun Yoon, Junui Hong, Yelim Yu, Mincheol Cho, Minsung Choi, Jaegul Choo

TL;DR

This work tackles negative transfer in Cross-Domain Sequential Recommendation by proposing SyNCRec, an asymmetric cooperative framework that jointly learns SDSR and CDSR while estimating a Negative Transfer Gap per domain and reweighting losses to limit harmful gradient flow. It introduces an ACMoE architecture with decoupled expert paths for SDSR and CDSR and a loss-correction mechanism, plus a Single-Cross Mutual Information Maximization objective to promote beneficial cross-domain cue transfer. Empirically, SyNCRec outperforms 25 state-of-the-art baselines across two real-world datasets and ten domains, and online deployment yields substantial business value with notable CTR improvements. The approach provides a scalable, multi-domain solution for industrial recommender systems, balancing cross-domain knowledge transfer with domain-specific performance.

Abstract

Cross-Domain Sequential Recommendation (CDSR) improves recommendation performance by utilizing information from multiple domains, in contrast to Single-Domain Sequential Recommendation (SDSR), which relies on historical interactions within a single domain. However, CDSR may underperform SDSR in certain domains due to negative transfer, which occurs when domains are weakly related or differ in their levels of data sparsity. To address negative transfer, our proposed CDSR model estimates the degree of negative transfer in each domain and adaptively assigns it as a weight factor to the prediction loss, controlling gradient flow through domains with significant negative transfer. To this end, our model uses an asymmetric cooperative network to compare the performance of a model trained on multiple domains (CDSR) with that of a model trained solely on the specific domain (SDSR), and thereby evaluates the negative transfer of each domain. In addition, to facilitate the transfer of valuable cues between the SDSR and CDSR tasks, we developed an auxiliary loss that maximizes the mutual information between the representation pairs from both tasks on a per-domain basis. This cooperative learning between the SDSR and CDSR tasks resembles the collaborative dynamics between pacers and runners in a marathon. Our model outperformed numerous previous works in extensive experiments on two real-world industrial datasets across ten service domains. We have also deployed our model in the recommendation system of our personal assistant app service, resulting in a 21.4% increase in click-through rate compared to existing models, which is valuable to real-world business.
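
The loss reweighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula, so the gap definition (CDSR loss minus SDSR loss), the softmax mapping from gap to weight, its direction (a larger gap gets a larger weight), the `temperature` parameter, all function names, and the loss values are assumptions made for illustration.

```python
import math

def negative_transfer_gap(sdsr_loss, cdsr_loss):
    # ASSUMPTION: the gap is the CDSR loss minus the SDSR loss per domain.
    # A positive gap means the cross-domain model does worse than the
    # single-domain model there, i.e. the domain shows negative transfer.
    return {d: cdsr_loss[d] - sdsr_loss[d] for d in sdsr_loss}

def gap_to_weights(gap, temperature=1.0):
    # ASSUMPTION: a softmax over the gaps turns them into weights that sum
    # to 1, with larger gaps mapped to larger weights. The mapping used in
    # the paper may differ in form and direction.
    top = max(gap.values())  # subtract the max for numerical stability
    exps = {d: math.exp((g - top) / temperature) for d, g in gap.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}

def weighted_prediction_loss(cdsr_loss, weights):
    # Per-domain CDSR losses combined with the per-domain weights.
    return sum(weights[d] * cdsr_loss[d] for d in cdsr_loss)

# Illustrative (made-up) per-domain losses for three domains.
sdsr = {"A": 1.0, "B": 2.0, "C": 1.5}
cdsr = {"A": 1.5, "B": 1.8, "C": 1.5}

gap = negative_transfer_gap(sdsr, cdsr)
weights = gap_to_weights(gap)
total_loss = weighted_prediction_loss(cdsr, weights)
```

In this toy setting only domain A has a positive gap, so it is the only domain flagged as suffering negative transfer; domain B benefits from cross-domain training and receives the smallest weight under the assumed mapping.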

Paper Structure (31 sections, 19 equations, 3 figures, 5 tables)

Figures (3)

  • Figure 1: In the Book, Clothing (Amazon dataset), and Call (Telco dataset) domains, the SDSR model (trained on a single domain) outperformed the CDSR model (trained on multiple domains), indicating negative transfer from other domains. CGRec [park2023cracking] was used for these experiments.
  • Figure 2: We illustrate SyNCRec using three domains ($A$, $B$, $C$), each represented by a distinct color (blue, yellow, and pink). The notations $\oplus$, $\ominus$ and $\otimes$ indicate element-wise summation, subtraction, and multiplication, respectively. For the SDSR task for each domain, we input $X^{A}=[(\texttt{SOS}), x^{A}_{1},x^{A}_{3},x^{A}_{5}]$, $X^{B}=[(\texttt{SOS}), x^{B}_{2},x^{B}_{4}]$, and $X^{C}=[(\texttt{SOS}), x^{C}_{6}, x^{C}_{7}]$, and predict their shifted sequences $[x^{A}_{1},x^{A}_{3},x^{A}_{5},(\texttt{PAD})]$, $[x^{B}_{2},x^{B}_{4},x^{B}_{8}]$, and $[x^{C}_{6}, x^{C}_{7},(\texttt{PAD})]$. For the CDSR task, we input $X=[(\texttt{SOS}), x^{A}_{1}, x^{B}_{2}, x^{A}_{3}, x^{B}_{4}, x^{A}_{5}, x^{C}_{6}, x^{C}_{7}]$ into the model, and predict the shifted sequence $[x^{A}_{1}, x^{B}_{2}, x^{A}_{3}, x^{B}_{4}, x^{A}_{5}, x^{C}_{6}, x^{C}_{7}, x^{B}_{8}]$. We do not consider losses for (PAD) tokens.
  • Figure 3: Hyper-parameter study of our model.
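
The input and target sequences in the Figure 2 caption can be reproduced with a short sketch. It assumes each interaction is a `(domain, item)` pair in chronological order and that the ground-truth next interaction is known; the helper names `shift_pair` and `build_sequences` and the string tokens are hypothetical and do not come from the paper.

```python
SOS, PAD = "<SOS>", "<PAD>"

def shift_pair(items, next_item=None):
    # Input is the sequence prefixed with SOS; the target is the same
    # sequence shifted by one step, ending with the next item if it is
    # known for this sequence, otherwise with PAD (which carries no loss).
    inputs = [SOS] + items
    targets = items + [next_item if next_item is not None else PAD]
    return inputs, targets

def build_sequences(mixed, next_interaction):
    # mixed: chronological list of (domain, item) pairs.
    # next_interaction: the (domain, item) pair that follows the sequence.
    next_domain, next_item = next_interaction

    # CDSR task: one mixed sequence over all domains.
    cdsr = shift_pair([item for _, item in mixed], next_item)

    # SDSR task: one sub-sequence per domain, in the original order.
    per_domain = {}
    for domain, item in mixed:
        per_domain.setdefault(domain, []).append(item)

    sdsr = {}
    for domain, items in per_domain.items():
        # Only the domain of the next interaction has a real final target.
        last = next_item if domain == next_domain else None
        sdsr[domain] = shift_pair(items, last)
    return cdsr, sdsr

# The three-domain example from the Figure 2 caption.
mixed = [("A", "x1"), ("B", "x2"), ("A", "x3"), ("B", "x4"),
         ("A", "x5"), ("C", "x6"), ("C", "x7")]
cdsr, sdsr = build_sequences(mixed, ("B", "x8"))
```

Here `sdsr["B"]` ends with the real next item `x8`, while the targets for domains A and C end with `PAD`, matching the caption's note that no loss is computed for PAD tokens.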