Rethinking Cross-Domain Sequential Recommendation under Open-World Assumptions

Wujiang Xu, Qitian Wu, Runzhong Wang, Mingming Ha, Qiongxu Ma, Linxun Chen, Bing Han, Junchi Yan

TL;DR

This work rethinks cross-domain sequential recommendation under open-world assumptions where overlapping users are sparse and data distributions shift between offline and online settings. It introduces Adaptive Multi-Interest Debiasing (AMID), a framework combining a Multi-Interest Information Module (MIM) to propagate cross-domain information across both overlapping and non-overlapping users and a Doubly Robust Estimator (DRE) to yield unbiased cross-domain evaluations. The authors provide theoretical analysis showing that DRE achieves smaller bias and tighter tail bounds than traditional IPS estimators, enhancing reliability under distribution shifts. Empirically, AMID improves performance across CDSR benchmarks, validates on a real-world MYbank-CDR dataset, and yields substantial online gains in exposure, clicks, and conversion on financial platforms, illustrating strong practical impact in open-world settings.

Abstract

Cross-Domain Sequential Recommendation (CDSR) methods aim to tackle the data sparsity and cold-start problems present in Single-Domain Sequential Recommendation (SDSR). Existing CDSR works design elaborate structures that rely on overlapping users to propagate cross-domain information. However, current CDSR methods make closed-world assumptions: they assume that users fully overlap across multiple domains and that the data distribution remains unchanged from the training environment to the test environment. As a result, these methods typically perform worse on online real-world platforms because of data distribution shifts. To address these challenges under open-world assumptions, we design an Adaptive Multi-Interest Debiasing framework for cross-domain sequential recommendation (AMID), which consists of a multi-interest information module (MIM) and a doubly robust estimator (DRE). Our framework is adaptive to open-world environments and can improve most off-the-shelf single-domain sequential backbone models for CDSR. Our MIM establishes interest groups that consider both overlapping and non-overlapping users, allowing us to effectively explore user intent and explicit interest. To alleviate biases across multiple domains, we develop the DRE for CDSR methods. We also provide a theoretical analysis that demonstrates the superiority of our proposed estimator in terms of bias and tail bound, compared to the IPS estimator used in previous work.
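
For readers unfamiliar with the two estimators the abstract contrasts, the following is a minimal sketch of the generic textbook forms of an IPS estimator and a doubly robust estimator over per-interaction prediction errors. It is not the paper's cross-domain formulation; the function names, the constant propensities, and the toy numbers are all illustrative assumptions.

```python
def ips_estimate(errors, observed, propensities):
    """Inverse-propensity-scored mean of prediction errors.

    Errors of unobserved entries (observed == 0) never contribute.
    """
    n = len(errors)
    return sum(o * e / p for e, o, p in zip(errors, observed, propensities)) / n


def dr_estimate(errors, imputed, observed, propensities):
    """Doubly robust mean: imputed error plus a propensity-weighted correction."""
    n = len(errors)
    return sum(
        e_hat + o * (e - e_hat) / p
        for e, e_hat, o, p in zip(errors, imputed, observed, propensities)
    ) / n


# Toy data (assumed): four user-item pairs, two of which are observed.
errors = [1.0, 2.0, 3.0, 4.0]        # true errors; unknown in practice when unobserved
observed = [1, 0, 1, 0]              # observation indicators
propensities = [0.5, 0.5, 0.5, 0.5]  # assumed observation probabilities

true_mean = sum(errors) / len(errors)
ips = ips_estimate(errors, observed, propensities)
dr_exact = dr_estimate(errors, errors, observed, propensities)
dr_zero = dr_estimate(errors, [0.0] * 4, observed, propensities)
```

In this toy case the IPS estimate depends entirely on which entries happened to be observed, while the doubly robust estimate recovers the true mean whenever the imputed errors are exact, and reduces to the IPS estimate when the imputed errors are all zero. This is the "correct if either the propensities or the imputations are accurate" property that gives the estimator its name.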

Paper Structure

This paper contains 25 sections, 16 equations, 7 figures, 8 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Traditional methods [ma2019pi, zhang2018cross, zhao2017unified] focus only on overlapping users (a), and a few methods [li2020ddtcdr, li2021dual, cao2022contrastive] can handle non-overlapping users (b), but they still have limitations. In contrast, our method not only considers users (a) and (b), but also assigns importance to unseen users (c).
  • Figure 2: Solid lines denote the SDSR methods, while dashed lines denote the CDSR methods. Due to the lack of abundant overlapping users, SASRec (SDSR) outperforms all the CDSR methods in the Movie domain.
  • Figure 3: Selection bias can cause a distribution shift in the cross-domain sequential scenario with few overlapping users (control ratio is 25%).
  • Figure 4: The causal graph of the selection bias in CDSR. Grey and white variables represent latent and observed variables, respectively.
  • Figure 5: Overview of our multi-interest information module. The encoder denotes the sequential information encoder from the SDSR model. The black (i.e., $u^{Z_1} \rightarrow u_1$) and blue (i.e., $u^{Z_1} \rightarrow u^{Z_2}$) solid arrows denote two different types of messages: those propagated between different users and those propagated for the same user across different domains, respectively.
  • ...and 2 more figures