MARCO: A Cooperative Knowledge Transfer Framework for Personalized Cross-domain Recommendations
Lili Xie, Yi Zhang, Ruihong Qiu, Jiajun Liu, Sen Wang
TL;DR
MARCO tackles data sparsity in cold-start cross-domain recommendation with a cooperative multi-agent reinforcement learning (MARL) framework in which each agent estimates the contribution of one source domain. It couples multi-source personalized bridges with Multi-Agent Proximal Policy Optimization (MAPPO) and introduces an entropy-based action-diversity penalty to stabilize training and counter distributional discrepancies across domains. Experiments on four Amazon sub-categories show that MARCO is more accurate than state-of-the-art methods and more robust to negative transfer, and that it generalizes across varying cold-start rates and source-domain configurations. By coordinating diverse agent policies, the approach makes heterogeneous source-domain signals usable for cross-domain personalization.
Abstract
Recommender systems frequently encounter data sparsity, particularly in cold-start scenarios involving new users or items. Multi-source cross-domain recommendation (CDR) addresses this challenge by transferring knowledge from multiple source domains to improve recommendations in a target domain. However, existing reinforcement learning (RL)-based CDR methods typically rely on a single-agent framework, which leads to negative transfer caused by inconsistent domain contributions and inherent distributional discrepancies among source domains. To overcome these limitations, MARCO, a Multi-Agent Reinforcement Learning-based Cross-Domain recommendation framework, is proposed. It leverages cooperative multi-agent reinforcement learning, where each agent is dedicated to estimating the contribution of an individual source domain, which helps manage credit assignment and mitigate negative transfer. In addition, an entropy-based action diversity penalty is introduced to enhance policy expressiveness and stabilize training by encouraging diversity in the agents' joint actions. Extensive experiments across four benchmark datasets demonstrate MARCO's superior performance over state-of-the-art methods, highlighting its robustness and strong generalization capabilities. The code is available at https://github.com/xiewilliams/MARCO.
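
The abstract does not spell out the entropy-based action diversity penalty, so the following is only an illustrative sketch of one plausible reading, not the authors' formulation. It assumes each joint action is a tuple of discrete per-agent actions (one agent per source domain), measures the Shannon entropy of the joint actions observed in a batch, and penalizes batches whose joint actions collapse onto a few values. The names `joint_action_entropy`, `diversity_penalty`, and `coef` are hypothetical.

```python
import math
from collections import Counter


def joint_action_entropy(joint_actions):
    """Shannon entropy (in nats) of the empirical distribution of joint actions.

    joint_actions: list of hashable tuples, one tuple per sample, holding
    one discrete action per agent (assumed representation).
    """
    n = len(joint_actions)
    counts = Counter(joint_actions)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def diversity_penalty(joint_actions, coef=0.1):
    """Hypothetical penalty: zero when every joint action in the batch is
    distinct, largest when all samples share the same joint action."""
    n = len(joint_actions)
    if n == 0:
        raise ValueError("joint_actions must be non-empty")
    max_entropy = math.log(n)  # upper bound for a batch of n samples
    return coef * (max_entropy - joint_action_entropy(joint_actions))


# Two agents (two source domains), four samples per batch.
collapsed = [(0, 1), (0, 1), (0, 1), (0, 1)]
diverse = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(diversity_penalty(collapsed))  # larger penalty
print(diversity_penalty(diverse))    # approximately zero
```

In a setup like this, the penalty would be subtracted from the shared reward or added to the policy loss; how MARCO actually integrates it with MAPPO is not stated here.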
