On Inherited Popularity Bias in Cold-Start Item Recommendation

Gregor Meehan; Johan Pauwels

TL;DR

This work studies how generative cold-start item recommenders inherit popularity bias from warm CF models, which leads to overexposure of certain items because cold items lack interaction data. By analyzing Heater, GAR, and GoRec supervised by a pre-trained warm model (FREEDOM) across three multimedia datasets, the authors show that content-based predictions can assign high predicted popularity to a subset of items, amplifying bias in cold-start predictions. They introduce a simple post-processing method that rescales each item embedding $x_c$ by a factor $\gamma_c$ chosen so that $\lVert \gamma_c x_c \rVert - \mu_w = \frac{\lVert x_c \rVert - \mu_w}{1+\alpha}$, which gives $\gamma_c = \frac{\lVert x_c \rVert + \alpha \mu_w}{\lVert x_c \rVert (1+\alpha)}$. The rescaling shrinks the gap between each embedding's magnitude and $\mu_w$ by a factor of $1+\alpha$, balancing exposure without severely harming user-level accuracy. Across datasets, this magnitude-based mitigation increases exposure diversity (higher $\text{Gini-Div}$) and improves MDG for low-end items while maintaining overall performance, suggesting a practical route to fairer cold-start recommendations; code is released for replication.
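The scaling rule above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released code: the function name `rescale_cold_embeddings` and the toy arrays are hypothetical, $\mu_w$ is assumed here to be the mean norm of the warm item embeddings, and $\alpha$ is assumed to be a non-negative strength parameter.

```python
import numpy as np


def rescale_cold_embeddings(x_cold, mu_w, alpha):
    """Shrink each embedding norm toward mu_w by a factor of (1 + alpha).

    Hypothetical helper: assumes every row of x_cold has a nonzero norm.
    """
    norms = np.linalg.norm(x_cold, axis=1, keepdims=True)
    # gamma_c = (||x_c|| + alpha * mu_w) / (||x_c|| * (1 + alpha))
    gamma = (norms + alpha * mu_w) / (norms * (1.0 + alpha))
    # Scaling by a positive scalar changes magnitude but not direction.
    return gamma * x_cold


# Toy data (made up for illustration only).
x_warm = np.array([[3.0, 4.0], [0.6, 0.8], [0.0, 3.0]])  # norms 5, 1, 3
x_cold = np.array([[6.0, 8.0], [0.3, 0.4]])              # norms 10, 0.5

# Assumption: mu_w is the mean warm embedding norm (3.0 for this toy data).
mu_w = np.linalg.norm(x_warm, axis=1).mean()

x_scaled = rescale_cold_embeddings(x_cold, mu_w, alpha=1.0)
new_norms = np.linalg.norm(x_scaled, axis=1)
print(new_norms)
```

With `alpha=1.0`, each norm's distance from `mu_w` is halved, so the large-norm item is pulled down and the small-norm item is pulled up. Setting `alpha=0` gives $\gamma_c = 1$ and leaves the embeddings unchanged.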

Abstract

Collaborative filtering (CF) recommender systems struggle with making predictions on unseen, or 'cold', items. Systems designed to address this challenge are often trained with supervision from warm CF models in order to leverage collaborative and content information from the available interaction data. However, since they learn to replicate the behavior of CF methods, cold-start models may also learn to imitate their predictive biases. In this paper, we show that cold-start systems can inherit popularity bias, a common cause of recommender system unfairness arising when CF models overfit to more popular items, thereby maximizing user-oriented accuracy but neglecting rarer items. We demonstrate that cold-start recommenders not only mirror the popularity biases of warm models, but are in fact affected more severely: because they cannot infer popularity from interaction data, they instead attempt to estimate it based solely on content features. This leads to significant over-prediction of certain cold items whose content is similar to that of popular warm items, even if their ground truth popularity is very low. Through experiments on three multimedia datasets, we analyze the impact of this behavior on three generative cold-start methods. We then describe a simple post-processing bias mitigation method that, by using embedding magnitude as a proxy for predicted popularity, can produce more balanced recommendations with limited harm to user-oriented cold-start accuracy.
