Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems
Jin Huang, Harrie Oosterhuis, Masoud Mansoury, Herke van Hoof, Maarten de Rijke
TL;DR
Single-factor debiasing is inadequate for recommender systems, so this work introduces multifactorial bias, in which the probability that a rating is observed depends on both the item and the rating value. It proposes a Bayes-inspired multifactorial propensity estimator with Laplace smoothing and integrates it into an inverse propensity scoring (IPS) based matrix factorization (MF) framework, MF-IPS^{Mul}, which adds propensity smoothing and alternating gradient descent to counter data sparsity and optimization instability. Empirical results on real-world data (Yahoo!R3, Coat) and in semi-synthetic experiments show that MF-IPS^{Mul} gives the most robust and accurate bias correction among the compared methods, outperforming single-factor baselines and closely approaching an oracle with known propensities. The approach improves rating prediction accuracy under bias, which suggests a practical path for debiasing recommender systems, particularly when multiple factors jointly influence user interaction behavior. The authors note that the method is limited to explicit-feedback settings and propose extending it to implicit feedback and broader recommendation scenarios as future work.
Abstract
Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias, which manifest themselves as the over-representation of interactions with popular items or items that users prefer, respectively. Debiasing methods aim to mitigate the effect of selection bias on the evaluation and optimization of RSs. However, existing debiasing methods only consider single-factor forms of bias, e.g., only the item (popularity) or only the rating value (positivity). This is in stark contrast with the real world where user selections are generally affected by multiple factors at once. In this work, we consider multifactorial selection bias in RSs. Our focus is on selection bias affected by both item and rating value factors, which is a generalization and combination of popularity and positivity bias. While the concept of multifactorial bias is intuitive, it brings a severe practical challenge as it requires substantially more data for accurate bias estimation. As a solution, we propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization. Our experimental results reveal that, with our proposed techniques, multifactorial bias corrections are more effective and robust than single-factor counterparts on real-world and synthetic datasets.
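To make the idea of a propensity that depends on both item and rating value concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the Bayes decomposition P(o=1 | i, r) = P(r | o=1, i) · P(o=1 | i) / P(r | i), with Laplace-smoothed counts, and it assumes that a small uniformly sampled (unbiased) set of ratings is available to estimate P(r | i). The function names `estimate_propensities` and `ips_loss`, the `alpha` and `floor` parameters, and the `(user, item, rating)` tuple layout are all hypothetical choices made for illustration.

```python
from collections import Counter


def estimate_propensities(mnar, mcar, n_users, items, ratings, alpha=1.0, floor=1e-3):
    """Estimate rho[(item, rating)], an approximation of P(observed | item, rating).

    ASSUMPTION (not taken from the text above): Bayes' rule is applied as
        P(o=1 | i, r) = P(r | o=1, i) * P(o=1 | i) / P(r | i),
    where `mnar` holds the biased logged ratings and `mcar` is a small
    uniformly sampled set of ratings. Both are lists of (user, item, rating).
    """
    n_ir = Counter((i, r) for _, i, r in mnar)  # biased counts per (item, rating)
    n_i = Counter(i for _, i, _ in mnar)        # biased counts per item
    m_ir = Counter((i, r) for _, i, r in mcar)  # unbiased counts per (item, rating)
    m_i = Counter(i for _, i, _ in mcar)        # unbiased counts per item
    k = len(ratings)

    rho = {}
    for i in items:
        p_obs = n_i[i] / n_users  # P(o=1 | i): fraction of users who rated item i
        for r in ratings:
            # Laplace smoothing keeps both estimates away from zero on sparse data.
            p_r_given_obs = (n_ir[(i, r)] + alpha) / (n_i[i] + alpha * k)
            p_r = (m_ir[(i, r)] + alpha) / (m_i[i] + alpha * k)
            # Clip into [floor, 1] so inverse weights stay finite and valid.
            rho[(i, r)] = min(1.0, max(floor, p_r_given_obs * p_obs / p_r))
    return rho


def ips_loss(mnar, predict, rho, n_users, n_items):
    """Inverse-propensity-weighted squared error over the observed ratings."""
    total = sum((predict(u, i) - r) ** 2 / rho[(i, r)] for u, i, r in mnar)
    return total / (n_users * n_items)
```

In this sketch, ratings that are over-represented in the logged data (for example, high ratings on popular items) receive larger propensities and therefore smaller weights in the loss. Sparse (item, rating) cells are the reason for the smoothing and the clipping: without them a single rare observation could dominate the weighted objective.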
