A Soft-partitioned Semi-supervised Collaborative Transfer Learning Approach for Multi-Domain Recommendation
Xiaoyu Liu, Yiqing Wu, Ruidong Han, Fuzhen Zhuang, Xiang Li, Wei Lin
TL;DR
This work tackles multi-domain recommendation (MDR) under severe domain imbalance, where the dominant domain tends to overwhelm training and non-dominant domains are prone to overfitting. It introduces Soft-partitioned Semi-supervised Collaborative Transfer Learning (SSCTL), which combines Instance Soft-partitioned Collaborative Training (ISCT) with a Soft-partitioned Domain Differentiation Network (SDDN) to enable dynamic, domain-aware parameterization and cross-domain transfer. Key contributions include a soft-partitioning mechanism, the SSCTL framework, and offline and online validation, with online tests showing improvements in GMV (0.54% to 2.90%) and CTR (0.22% to 1.69%) across domains. The approach offers practical gains for industry-scale MDR systems by using dominant-domain data to benefit non-dominant domains while mitigating both overfitting and the overwhelming effect of the dominant domain.
Abstract
In industrial practice, Multi-domain Recommendation (MDR) plays a crucial role. Shared-specific architectures are widely used in industrial solutions to capture shared and domain-unique attributes via shared and domain-specific parameters. However, when data is imbalanced across domains, these models face two key issues: (1) Overwhelming: data from the dominant domain skews model performance, so non-dominant domains are neglected. (2) Overfitting: sparse data in non-dominant domains causes the specific parameters to overfit. To tackle these challenges, we propose Soft-partitioned Semi-supervised Collaborative Transfer Learning (SSCTL) for multi-domain recommendation. To address the overwhelming issue, SSCTL generates dynamic parameters that shift the model's focus towards samples from non-dominant domains. To combat overfitting, it assigns weighted pseudo-labels to dominant-domain instances and uses them to augment non-dominant-domain data. We conduct comprehensive online and offline experiments to validate the efficacy of the proposed method. Online tests yielded significant improvements across various domains, with GMV increases ranging from 0.54% to 2.90% and CTR increases ranging from 0.22% to 1.69%.
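The weighted pseudo-label idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-instance weight in [0, 1], and the weighted-average normalization are all assumptions made for illustration. It only shows how dominant-domain instances carrying a pseudo-label and a weight could contribute to a non-dominant domain's training loss alongside that domain's own labeled samples.

```python
import math

def bce(label, pred, eps=1e-7):
    """Binary cross-entropy for one example; pred is a predicted click probability."""
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def weighted_pseudo_label_loss(labeled, pseudo):
    """Hypothetical loss for one non-dominant domain.

    labeled: list of (true_label, pred) pairs from the non-dominant domain.
    pseudo:  list of (pseudo_label, weight, pred) triples built from
             dominant-domain instances; weight in [0, 1] is an assumed
             confidence score, not a quantity defined in the source.
    Returns the weighted average loss over both sets.
    """
    total = sum(bce(y, p) for y, p in labeled)
    norm = float(len(labeled))  # each labeled sample has weight 1
    for y_hat, w, p in pseudo:
        total += w * bce(y_hat, p)  # low-weight pseudo-labels contribute little
        norm += w
    return total / norm if norm > 0 else 0.0

# Usage: two real samples plus two dominant-domain instances with pseudo-labels.
labeled = [(1, 0.8), (0, 0.3)]
pseudo = [(1, 0.9, 0.7), (0, 0.2, 0.6)]
loss = weighted_pseudo_label_loss(labeled, pseudo)
```

In this sketch a pseudo-labeled instance with weight 0 leaves the loss unchanged, so the weights control how strongly dominant-domain data influences the non-dominant domain.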
