FedSlate: A Federated Deep Reinforcement Learning Recommender System
Yongxin Deng, Xihe Qiu, Xiaoyu Tan, Yaochu Jin
TL;DR
FedSlate tackles cross-platform privacy in recommender systems by integrating SlateQ with federated reinforcement learning. The method maintains a local Q-network on each platform and trains a central federated Q-network that combines their outputs to guide slate selection, enabling cross-platform optimization of long-term value (LTV) without sharing raw data. Experiments in RecSim show that FedSlate can accelerate learning on platforms that receive user feedback and lets platforms that lack feedback benefit from federation, while ablations confirm that the federated architecture is necessary. The work advances privacy-preserving, multi-platform RL for long-term user engagement and points to future improvements in cost-efficiency and fairness.
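The data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `local_output` and `federated_q_values`, the use of single linear layers, and the choice to concatenate the local outputs are all assumptions made for illustration. The only point it aims to show is that each platform sends the output of its local network, not its private state, to the central network.

```python
# Illustrative sketch only: linear "networks" stand in for the real Q-networks,
# and all names and shapes here are hypothetical.

def local_output(weights, private_state):
    """Hypothetical local Q-network: one linear layer over a platform's private state.

    Only the returned vector leaves the platform; private_state does not.
    """
    return [sum(w * x for w, x in zip(row, private_state)) for row in weights]


def federated_q_values(fed_weights, local_outputs):
    """Hypothetical federated Q-network: a linear layer over the concatenated
    local outputs, producing one Q-value per candidate item."""
    joint = [value for output in local_outputs for value in output]
    return [sum(w * x for w, x in zip(row, joint)) for row in fed_weights]


# Platform A and platform B each keep their own state and weights.
out_a = local_output([[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0])
out_b = local_output([[1.0, 1.0]], [1.0, 4.0])

# The central network sees only out_a and out_b.
q_values = federated_q_values([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]], [out_a, out_b])
best_item = max(range(len(q_values)), key=lambda i: q_values[i])
```

In this toy setup the candidate with the highest combined Q-value would be recommended first; a real system would train both levels of networks rather than fix the weights by hand.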
Abstract
Reinforcement learning methods have been used to optimize long-term user engagement in recommendation systems. However, existing reinforcement learning-based recommendation systems do not fully exploit how an individual user's behavior on one platform relates to that user's behavior on others. One potential solution is to aggregate data from various platforms in a centralized location and train on the aggregated data. However, this approach raises economic and legal concerns, including increased communication costs and potential threats to user privacy. To address these challenges, we propose FedSlate, a federated reinforcement learning recommendation algorithm that makes effective use of information that cannot legally be shared across platforms. We employ the SlateQ algorithm to help FedSlate learn users' long-term behavior and evaluate the value of recommended content. We extend the application scope of recommendation systems from single-user single-platform to single-user multi-platform and address cross-platform learning challenges by introducing federated learning. We use RecSim to construct a simulation environment for evaluating FedSlate and compare its performance with state-of-the-art benchmark recommendation models. Experimental results show that FedSlate outperforms baseline methods in various environmental settings and that it can learn recommendation strategies in scenarios where baseline methods are completely inapplicable. Code is available at https://github.com/TianYaDY/FedSlate.
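As background for the SlateQ component mentioned above, the following sketch shows the slate-value decomposition that SlateQ relies on: the value of a slate is the choice-probability-weighted sum of item-level long-term values, Q(s, A) = sum over items i in A of P(i | s, A) * Q(s, i). The conditional-logit choice model, the function names, and the simplification that a "no click" outcome contributes zero value are assumptions of this sketch, not details taken from the FedSlate paper.

```python
import math


def choice_probabilities(scores, no_click_score=0.0):
    """Conditional-logit choice over the slate items plus a 'no click' option.

    Returns only the probabilities of the slate items (assumed choice model).
    """
    logits = list(scores) + [no_click_score]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - shift) for x in logits]
    total = sum(weights)
    return [w / total for w in weights[:-1]]


def slate_q_value(item_q_values, scores, no_click_score=0.0):
    """Slate value as the choice-weighted sum of item-level Q-values.

    Simplification: the 'no click' outcome is assumed to add no value.
    """
    probs = choice_probabilities(scores, no_click_score)
    return sum(p * q for p, q in zip(probs, item_q_values))


# Two items with equal appeal and a no-click option: each is chosen with
# probability 1/3, so the slate value is (3 + 6) / 3.
example_value = slate_q_value([3.0, 6.0], [0.0, 0.0])
```

The decomposition is what makes slate recommendation tractable: the agent only needs to learn item-level values and can then score any candidate slate by combining them with the choice model.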
