Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems

Jin Huang; Harrie Oosterhuis; Masoud Mansoury; Herke van Hoof; Maarten de Rijke

TL;DR

This work addresses the inadequacy of single-factor debiasing in recommender systems (RSs) by introducing multifactorial bias, where the probability that a rating is observed depends on both the item and the rating value. It proposes a Bayes-inspired multifactorial propensity estimator with Laplace smoothing and integrates it into an IPS-based matrix factorization (MF) framework, MF-IPS$^{Mul}$. To combat data sparsity and training instability, the framework adds propensity smoothing and an alternating gradient descent optimization. Experiments on real-world data (Yahoo!R3, Coat) and semi-synthetic data show that MF-IPS$^{Mul}$ gives the most robust and accurate bias correction among the compared methods: it outperforms single-factor baselines and closely approaches an oracle with known propensities. The approach improves rating prediction accuracy under bias, which suggests a practical and robust path for debiasing in RSs, particularly when multiple factors jointly influence user interaction behavior. The authors also discuss the method's restriction to explicit-feedback settings and propose future work that extends it to implicit feedback and broader recommendation scenarios.
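The idea of a Bayes-style propensity estimate with Laplace smoothing can be sketched as follows. This is a minimal illustration, not the paper's exact estimator, which is not reproduced in this summary. It assumes that P(O=1 | Y=y, I=i) is decomposed as P(Y=y | O=1, I=i) · P(O=1 | I=i) / P(Y=y | I=i), that a small set of unbiased (uniformly collected) ratings is available to estimate the denominator, and that ratings are integers from 1 to `n_ratings`. All function and parameter names are hypothetical.

```python
import numpy as np


def smoothed_counts(items, ratings, n_items, n_ratings, alpha):
    """Item-by-rating count table with Laplace (add-alpha) smoothing."""
    items = np.asarray(items, dtype=int)
    ratings = np.asarray(ratings, dtype=int)
    counts = np.zeros((n_items, n_ratings))
    # Accumulate one count per logged (item, rating) pair; ratings are 1-based.
    np.add.at(counts, (items, ratings - 1), 1.0)
    return counts + alpha


def multifactorial_propensities(log_items, log_ratings, mar_items, mar_ratings,
                                n_users, n_items, n_ratings, alpha=1.0):
    """Sketch of P(O=1 | Y=y, I=i) for every (item, rating value) pair.

    Assumption: `mar_*` holds a small unbiased sample of ratings, used only
    to estimate P(Y=y | I=i). The decomposition is plain Bayes' rule.
    """
    log_counts = smoothed_counts(log_items, log_ratings, n_items, n_ratings, alpha)
    mar_counts = smoothed_counts(mar_items, mar_ratings, n_items, n_ratings, alpha)

    # P(Y=y | O=1, I=i): rating distribution among observed ratings of item i.
    p_y_given_obs_item = log_counts / log_counts.sum(axis=1, keepdims=True)
    # P(Y=y | I=i): rating distribution of item i in the unbiased sample.
    p_y_given_item = mar_counts / mar_counts.sum(axis=1, keepdims=True)
    # P(O=1 | I=i): smoothed fraction of users who rated item i.
    item_counts = np.bincount(np.asarray(log_items, dtype=int), minlength=n_items)
    p_obs_given_item = (item_counts + alpha) / (n_users + 2.0 * alpha)

    propensities = p_y_given_obs_item * p_obs_given_item[:, None] / p_y_given_item
    # Clip so that inverse propensity weights stay finite and probabilities valid.
    return np.clip(propensities, 1e-6, 1.0)
```

With one smoothed table per (item, rating value) pair, the number of cells to estimate grows with the number of items times the number of rating values, which is why sparse data makes smoothing important in this setting.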

Abstract

Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias, which manifest themselves as the over-representation of interactions with popular items or items that users prefer, respectively. Debiasing methods aim to mitigate the effect of selection bias on the evaluation and optimization of RSs. However, existing debiasing methods only consider single-factor forms of bias, e.g., only the item (popularity) or only the rating value (positivity). This is in stark contrast with the real world where user selections are generally affected by multiple factors at once. In this work, we consider multifactorial selection bias in RSs. Our focus is on selection bias affected by both item and rating value factors, which is a generalization and combination of popularity and positivity bias. While the concept of multifactorial bias is intuitive, it brings a severe practical challenge as it requires substantially more data for accurate bias estimation. As a solution, we propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization. Our experimental results reveal that, with our proposed techniques, multifactorial bias corrections are more effective and robust than single-factor counterparts on real-world and synthetic datasets.

Paper Structure (20 sections, 20 equations, 5 figures, 2 tables, 1 algorithm)

Introduction
Conceptualization of selection bias
Preliminaries
Definition of selection bias
Rating prediction from user ratings
IPS-based debiasing method
Existing single-factor propensity estimation
Correction for Multifactorial Bias
Definition of multifactorial bias
Propensity estimate for multifactorial bias
A debiasing method for multifactorial bias
Experiments on Real-world Data
Experimental setup
Overall performance
Smoothing and alternating gradient descent
...and 5 more sections

Figures (5)

Figure 1: The dependency between observance (O), items (I), and rating values (Y) for different bias assumptions: (a) Positivity bias: propensities only depend on rating values; (b) Popularity bias: propensities only depend on items; (c) Multifactorial bias: propensities depend on both factors.
Figure 2: Skewed distributions of (a) rating values and (b) item popularity in the logged training set (train) of the Yahoo!R3 dataset, and (c) the number and average rating of items per popularity group, where each group contains the items whose number of interactions falls within a certain interval. All statistics are counted from logged user ratings on self-selected songs in the Yahoo!R3 dataset.
Figure 3: (Yahoo!R3) The effect of varying smoothing parameters $\alpha_1$ and $\alpha_2$ on MSE obtained by our multifactorial method.
Figure 4: (Yahoo!R3) Learning curves tracking self-normalized IPS-weighted MSE on the validation set and MSE on the test set obtained by our multifactorial method. Results are means over 10 independent runs; shaded areas show the 95% confidence intervals calculated using bootstrapping (DiCiccio and Efron, 1996).
Figure 5: Performance in our simulated setting with different dependencies of bias on item and rating value factors through varying $\gamma$ (x-axis, Eq. \ref{eq:simulation}). Results are means over 10 independent runs; shaded areas show 95% bootstrap confidence intervals (DiCiccio and Efron, 1996).
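The self-normalized IPS-weighted MSE tracked in Figure 4 can be illustrated with the standard self-normalized estimator: each squared error is weighted by the inverse of its propensity, and the weighted sum is divided by the sum of the weights. The sketch below uses that textbook form; the function name is hypothetical and the paper's exact validation procedure is not shown in this summary.

```python
import numpy as np


def snips_mse(y_true, y_pred, propensities):
    """Self-normalized inverse-propensity-scored mean squared error.

    Each observed rating is weighted by 1 / propensity, so ratings that were
    unlikely to be observed count more. Dividing by the sum of the weights
    (instead of a fixed count) reduces the variance of the estimate.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = 1.0 / np.asarray(propensities, dtype=float)
    squared_errors = (y_true - y_pred) ** 2
    return float(np.sum(weights * squared_errors) / np.sum(weights))
```

When all propensities are equal, the weights cancel and the estimator reduces to the ordinary MSE over the observed ratings.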

Theorems & Definitions (4)

Definition 1: Selection bias
Definition 2: Positivity bias
Definition 3: Popularity bias
Definition 4: Multifactorial bias
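The three bias assumptions listed above can be contrasted in one place. The block below paraphrases the caption of Figure 1 using its variables (observance $O$, item $I$, rating value $Y$); it is a notational sketch, not the paper's exact definitions, which are not reproduced in this summary.

```latex
\begin{align*}
\text{Positivity bias:} \quad
  & P(O = 1 \mid Y = y, I = i) = P(O = 1 \mid Y = y) \\
\text{Popularity bias:} \quad
  & P(O = 1 \mid Y = y, I = i) = P(O = 1 \mid I = i) \\
\text{Multifactorial bias:} \quad
  & P(O = 1 \mid Y = y, I = i) \text{ depends on both } y \text{ and } i
\end{align*}
```

Under the first two assumptions the propensity collapses to a function of a single factor, so the multifactorial case contains both as special cases.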
