Correcting Popularity Bias in Recommender Systems via Item Loss Equalization

Juno Prent, Masoud Mansoury

TL;DR

This paper tackles popularity bias in recommender systems by introducing Item Loss Equalization (ILE), an in-processing constraint that minimizes the disparity of training losses across item groups defined by popularity (Head, Mid, Tail). The objective becomes $L^* = L + \lambda \mathcal{D}(\{L_g \mid g \in G\})$, where $L$ is the base recommendation loss, $L_g$ is the loss on item group $g$, $\mathcal{D}$ measures cross-group loss disparity, and $\lambda$ controls the fairness emphasis. Experiments with Bayesian Personalized Ranking (BPR) on three real-world datasets show that ILE improves fairness metrics (UPD, AD, EE) with negligible reductions in ranking accuracy (nDCG) compared to baselines such as CP, PUFR, and IPS. The results demonstrate robust improvements in fairness, especially on datasets with pronounced popularity bias, and indicate practical potential for deploying disparity-aware training in RS. The work lays groundwork for extending the idea to counteract additional biases like positivity and mainstream biases in future research.
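To make the objective in the TL;DR concrete, here is a minimal NumPy sketch of a disparity-regularized BPR loss. It is an illustration under stated assumptions, not the authors' implementation. The summary does not specify the form of $\mathcal{D}$, so the sketch assumes the mean absolute difference between every pair of group losses. It also assumes each training sample is assigned to the popularity group of its positive item. The function names (`bpr_losses`, `disparity`, `ile_objective`) are hypothetical.

```python
import numpy as np


def bpr_losses(pos_scores, neg_scores):
    """Per-sample BPR loss, -log(sigmoid(pos - neg)), computed stably."""
    diff = np.asarray(pos_scores, dtype=float) - np.asarray(neg_scores, dtype=float)
    # -log(sigmoid(d)) == log(1 + exp(-d)) == logaddexp(0, -d)
    return np.logaddexp(0.0, -diff)


def disparity(group_losses):
    """ASSUMED form of D: mean absolute difference over all pairs of groups."""
    group_losses = np.asarray(group_losses, dtype=float)
    n = len(group_losses)
    if n < 2:
        return 0.0
    pairwise = np.abs(group_losses[:, None] - group_losses[None, :])
    # The matrix counts each unordered pair twice and has a zero diagonal.
    return pairwise.sum() / (n * (n - 1))


def ile_objective(pos_scores, neg_scores, item_groups, lam):
    """L* = L + lam * D({L_g}), with L_g the mean loss of group g.

    item_groups[i] is the popularity group (e.g. 0=Head, 1=Mid, 2=Tail)
    of the positive item in sample i (an assumption of this sketch).
    """
    item_groups = np.asarray(item_groups)
    losses = bpr_losses(pos_scores, neg_scores)
    base_loss = losses.mean()
    group_losses = np.array(
        [losses[item_groups == g].mean() for g in np.unique(item_groups)]
    )
    return base_loss + lam * disparity(group_losses)
```

Setting `lam=0` recovers the plain BPR loss. Larger values penalize the model more when one popularity group is fit much better than another. In actual training the same computation would be written in an autodiff framework so that the disparity term contributes gradients.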

Abstract

Recommender Systems (RS) often suffer from popularity bias, where a small set of popular items dominate the recommendation results due to their high interaction rates, leaving many less popular items overlooked. This phenomenon disproportionately benefits users with mainstream tastes while neglecting those with niche interests, leading to unfairness among users and exacerbating disparities in recommendation quality across different user groups. In this paper, we propose an in-processing approach to address this issue by intervening in the training process of recommendation models. Drawing inspiration from fair empirical risk minimization in machine learning, we augment the objective function of the recommendation model with an additional term aimed at minimizing the disparity in loss values across different item groups during the training process. Our approach is evaluated through extensive experiments on two real-world datasets and compared against state-of-the-art baselines. The results demonstrate the superior efficacy of our method in mitigating the unfairness of popularity bias while incurring only negligible loss in recommendation accuracy.

Paper Structure (12 sections, 2 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Training losses per item group.
  • Figure 2: Trade-off between nDCG and fairness metrics for varying $\lambda$ values on each method.
  • Figure 3: Time complexity of our ILE method and baselines on MovieLens1M, Goodreads, and Google Reviews datasets. For CP and PUFR, the size of the long recommendation lists is 100.