Oh, Behave! Country Representation Dynamics Created by Feedback Loops in Music Recommender Systems

Oleg Lesota, Jonas Geiger, Max Walder, Dominik Kowald, Markus Schedl

TL;DR

This work examines how feedback loops in music recommender systems shape country-level representation, focusing on the balance between exposure to local and US music. It runs an offline long-run simulation on the LFM-2b dataset with six diverse recommendation models, measuring representation as the proportions of US and local music and miscalibration as Jensen-Shannon divergence (JSD). The key findings are a general trend toward over-representation of US music and under-representation of local music in recommendations, substantial variation across countries, and large distribution shifts driven by NeuMF; importantly, popularity calibration does not guarantee country calibration. The results underscore the need for mitigation strategies that explicitly address country calibration, language and cultural signals, and different user behavior models in order to provide fairer, more representative recommendations.

Abstract

Recent work suggests that music recommender systems are prone to disproportionally frequent recommendations of music from countries more prominently represented in the training data, notably the US. However, it remains unclear to what extent feedback loops in music recommendation influence the dynamics of such imbalance. In this work, we investigate the dynamics of representation of local (i.e., country-specific) and US-produced music in user profiles and recommendations. To this end, we conduct a feedback loop simulation study using the standardized LFM-2b dataset. The results suggest that most of the investigated recommendation models decrease the proportion of music from local artists in their recommendations. Furthermore, we find that models preserving average proportions of US and local music do not necessarily provide country-calibrated recommendations. We also look into popularity calibration and, surprisingly, find that the most popularity-calibrated model in our study (ItemKNN) provides the least country-calibrated recommendations. In addition, users from less represented countries (e.g., Finland) are, in the long term, most affected by the under-representation of their local music in recommendations.
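
To make the idea of a feedback loop concrete, the following is a minimal toy sketch, not the authors' simulation. It assumes a plain popularity recommender (not one of the six models in the study), assumes that every user accepts every recommendation, and uses made-up items, users, and countries. All names (`local_share`, `simulate_feedback_loop`, the toy data) are hypothetical.

```python
from collections import Counter


def local_share(history, item_country, country):
    """Fraction of interactions with items from the given country."""
    if not history:
        return 0.0
    return sum(item_country[i] == country for i in history) / len(history)


def simulate_feedback_loop(histories, item_country, user_country, n_iterations, k=1):
    """Toy loop: retrain on all histories, recommend, feed accepted items back.

    Assumption: the recommender is a global popularity ranking and users
    accept all k recommended items. Returns, per user, the local share
    before the simulation and after each iteration.
    """
    histories = {u: list(h) for u, h in histories.items()}  # do not mutate input
    trajectory = {
        u: [local_share(h, item_country, user_country[u])]
        for u, h in histories.items()
    }
    for _ in range(n_iterations):
        # "Training": count plays over the current interaction histories.
        counts = Counter(i for h in histories.values() for i in h)
        # Rank by popularity; break ties by item id for determinism.
        ranking = sorted(item_country, key=lambda i: (-counts[i], i))
        # Recommend the k most popular items each user has not seen yet.
        recs = {
            u: [i for i in ranking if i not in h][:k]
            for u, h in histories.items()
        }
        # Feedback: accepted recommendations become part of the history.
        for u, items in recs.items():
            histories[u].extend(items)
            trajectory[u].append(
                local_share(histories[u], item_country, user_country[u])
            )
    return trajectory


# Made-up toy data: three US items, two Finnish items, three users.
item_country = {"us1": "US", "us2": "US", "us3": "US", "fi1": "FI", "fi2": "FI"}
user_country = {"a": "FI", "b": "US", "c": "US"}
histories = {
    "a": ["fi1", "fi2", "us1"],
    "b": ["us1", "us2"],
    "c": ["us1", "us2", "us3"],
}

trajectory = simulate_feedback_loop(histories, item_country, user_country, n_iterations=2)
```

In this toy setup the local share of the Finnish user `a` falls from 2/3 to 0.5 and then to 0.4, because the popularity ranking only has US items left to offer. The numbers illustrate the mechanism only and say nothing about the paper's results.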

Paper Structure (9 sections, 1 equation, 2 figures, 4 tables)

Figures (2)

  • Figure 1: Proportion of local items recommended by different algorithms. The dashed line shows the average consumption of local music before the simulation.
  • Figure 2: Miscalibration between the interaction history at iteration $i$ and the initial interaction history in terms of country (left) and popularity (right), measured as Jensen-Shannon divergence ($JSD$).
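
The miscalibration measure named in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the paper's implementation: it assumes base-2 logarithms (so values lie in [0, 1]), no smoothing, and a fixed, known list of categories. The helper names (`distribution`, `kl_divergence`, `jsd`) and the example histories are hypothetical.

```python
import math
from collections import Counter


def distribution(labels, categories):
    """Relative frequency of each category among the given labels."""
    if not labels:
        raise ValueError("cannot build a distribution from an empty history")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in categories]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two distributions over the same categories."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Made-up example: artist countries in one user's history, initially and at iteration i.
categories = ["FI", "US"]
initial_history = ["FI", "FI", "US"]
history_at_i = ["FI", "FI", "US", "US", "US"]

country_miscalibration = jsd(
    distribution(initial_history, categories),
    distribution(history_at_i, categories),
)
```

The same function applies to popularity calibration if the category labels are popularity groups instead of countries; how items are binned into such groups is not specified in this summary and would be a further assumption.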