FedFlex: Federated Learning for Diverse Netflix Recommendations

Sven Lankester, Gustavo de Carvalho Bertoli, Matias Vizcaino, Emmanuelle Beauxis Aussalet, Manel Slokom

TL;DR

FedFlex pioneers a live federated recommender for Netflix-like content that preserves user privacy while promoting diverse recommendations. It combines on-device SVD and BPR fine-tuning with an MMR reranker and differential privacy, evaluated in a two-week user study. Results show model-dependent accuracy effects, with BPR+MMR significantly improving $nDCG@5$ and MMR generally boosting diversity metrics such as CR and reducing KL divergence in some settings, while user satisfaction remains largely stable. The work demonstrates feasibility and practical considerations for privacy-preserving, diversity-aware recommendations, with clear directions for scaling and refinement in future deployments.

Abstract

The drive for personalization in recommender systems creates a tension between user privacy and the risk of "filter bubbles". Although federated learning offers a promising paradigm for privacy-preserving recommendations, its impact on diversity remains unclear. We introduce FedFlex, a two-stage framework that combines local, on-device fine-tuning of matrix factorization models (SVD and BPR) with a lightweight Maximal Marginal Relevance (MMR) re-ranking step to promote diversity. We conducted the first live user study of a federated recommender, collecting behavioral data and feedback during a two-week online deployment. Our results show that FedFlex successfully engages users, with BPR outperforming SVD in click-through rate. Re-ranking with MMR consistently improved ranking quality (nDCG) across both models, with statistically significant gains, particularly for BPR. Diversity effects varied: MMR increased coverage for both models and improved intra-list diversity for BPR, but slightly reduced it for SVD, suggesting different interactions between personalization and diversification across models. Our exit questionnaire responses indicated that most users expressed no clear preference between re-ranked and unprocessed lists, implying that increased diversity did not substantially reduce user satisfaction.
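The MMR re-ranking step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mmr_rerank`, the trade-off weight `lam`, and the genre-based Jaccard similarity are assumptions chosen for illustration, and the relevance scores stand in for whatever the SVD or BPR model would produce.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two genre sets (illustrative similarity choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(
    candidates: Sequence[Hashable],
    relevance: Dict[Hashable, float],
    similarity: Callable[[Hashable, Hashable], float],
    k: int,
    lam: float = 0.7,  # hypothetical default: 1.0 = pure relevance, 0.0 = pure diversity
) -> List[Hashable]:
    """Greedy Maximal Marginal Relevance selection of up to k items."""
    selected: List[Hashable] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def mmr_score(item: Hashable) -> float:
            # Penalise an item by its highest similarity to anything already chosen.
            max_sim = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1.0 - lam) * max_sim

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): model scores and genres for three titles.
scores = {"a": 0.9, "b": 0.85, "c": 0.5}
genres = {"a": {"drama"}, "b": {"drama"}, "c": {"comedy"}}


def genre_sim(x: str, y: str) -> float:
    return jaccard(genres[x], genres[y])


print(mmr_rerank(["a", "b", "c"], scores, genre_sim, k=3, lam=0.5))
```

With `lam=0.5` the comedy title `c` is promoted above the second drama title `b`, whereas `lam=1.0` reproduces the plain relevance ordering.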

Paper Structure

This paper contains 30 sections, 13 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: The workflow of FedFlex.
  • Figure 2: An example of the local web application.
  • Figure 3: Recommendation accuracy and diversity, with means and confidence intervals, for the SVD (top) and BPR (bottom) experiments.
  • Figure 4: Distribution of clicked genres for raw and re-ranked recommendations in the SVD and BPR experiments.