How Do Recommendation Models Amplify Popularity Bias? An Analysis from the Spectral Perspective

Siyi Lin; Chongming Gao; Jiawei Chen; Sheng Zhou; Binbin Hu; Yan Feng; Chun Chen; Can Wang

TL;DR

This work investigates why recommendation models amplify popularity bias when trained on long-tailed data. Using a spectral perspective, it reveals that item popularity is memorized in the principal spectrum of the predicted score matrix $\hat{\mathbf{Y}}$ and that dimension reduction magnifies this effect. The authors propose ReSN, a spectral-norm regularizer that directly penalizes the magnitude of the principal singular value with efficient approximations leveraging the low-rank structure $\mathbf{U}\mathbf{V}^{\top}$ and the popularity vector $\mathbf{r}$. The approach is supported by theoretical bounds and extensive experiments on seven real-world datasets showing improved fairness-accuracy trade-offs with minimal computational overhead. Overall, ReSN provides a scalable, model-agnostic tool for mitigating popularity bias in embedding-based recommender systems by targeting the dominant spectral component of predictions.

Abstract

Recommendation Systems (RS) are often plagued by popularity bias. When training a recommendation model on a typically long-tailed dataset, the model tends to not only inherit this bias but often exacerbate it, resulting in over-representation of popular items in the recommendation lists. This study conducts comprehensive empirical and theoretical analyses to expose the root causes of this phenomenon, yielding two core insights: 1) Item popularity is memorized in the principal spectrum of the score matrix predicted by the recommendation model; 2) The dimension collapse phenomenon amplifies the relative prominence of the principal spectrum, thereby intensifying the popularity bias. Building on these insights, we propose a novel debiasing strategy that leverages a spectral norm regularizer to penalize the magnitude of the principal singular value. We have developed an efficient algorithm to expedite the calculation of the spectral norm by exploiting the spectral property of the score matrix. Extensive experiments across seven real-world datasets and three testing paradigms have been conducted to validate the superiority of the proposed method.
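The abstract notes that the spectral norm can be computed efficiently by exploiting the structure of the score matrix. As a rough illustration only, the sketch below estimates the largest singular value of a low-rank score matrix $\mathbf{U}\mathbf{V}^{\top}$ with generic power iteration, never forming the full matrix. This is a standard textbook technique swapped in for illustration, not the authors' algorithm; the function name `spectral_norm_low_rank`, the matrix sizes, and the iteration count are all assumptions.

```python
import numpy as np

def spectral_norm_low_rank(U, V, n_iter=300, seed=0):
    """Estimate sigma_1 of Y = U @ V.T by power iteration, without materializing Y.

    U: (n_users, d) user embeddings; V: (n_items, d) item embeddings.
    Each iteration costs O((n_users + n_items) * d) instead of O(n_users * n_items).
    """
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(V.shape[0])
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        p = U @ (V.T @ q)          # p = Y q
        q = V @ (U.T @ p)          # q = Y^T p
        q /= np.linalg.norm(q)     # renormalize the right singular vector estimate
    return np.linalg.norm(U @ (V.T @ q))  # ||Y q|| approximates sigma_1

# Hypothetical embeddings, used only to compare against a dense SVD.
rng = np.random.default_rng(42)
U = rng.standard_normal((200, 8))
V = rng.standard_normal((300, 8))

sigma1_est = spectral_norm_low_rank(U, V)
sigma1_ref = np.linalg.svd(U @ V.T, compute_uv=False)[0]
print(f"power iteration: {sigma1_est:.4f}, dense SVD: {sigma1_ref:.4f}")
```

In a training loop, a penalty on this quantity would typically be added to the recommendation loss with a weighting coefficient; how the paper does this exactly is described in its method section, not here.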

Paper Structure (28 sections, 3 theorems, 30 equations, 7 figures, 11 tables)

Introduction
Preliminaries
Understanding Popularity Bias Amplification
Popularity Bias Memorization Effect
Popularity Bias Amplification Effect
Proposed Method
ReSN: Regulation with Spectral Norm
Discussions
Experiments
Experiment Settings
Performance Comparison (RQ1)
Adaptability Exploration (RQ2)
Hyperparameter Study (RQ3)
Efficiency Study (RQ4)
Related Work
...and 13 more sections

Key Result

Theorem 1

Given an embedding-based recommendation model with sufficient capacity, when training the model on the data with power-law item popularity, the cosine similarity between item popularity $\mathbf{r}$ and the principal singular vector $\mathbf{q}_1$ of the predicted score matrix is bounded with: For $\alpha>2$, this can be further bounded with: where $r_{\max}$ is the popularity of the most popula
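The alignment that Theorem 1 describes can be checked numerically on toy data. The sketch below is only an illustration under loud assumptions: the interaction matrix is synthetic, click probabilities follow a hypothetical power law ($0.5/i$ for item $i$), and the observed matrix stands in for the predicted score matrix (as if a high-capacity model had fit it exactly). It computes the popularity vector $\mathbf{r}$, the principal right singular vector $\mathbf{q}_1$, and their cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 2000, 200

# Assumption: item i (1-indexed) is clicked independently with probability 0.5 / i.
click_prob = 0.5 / np.arange(1, n_items + 1)
Y = (rng.random((n_users, n_items)) < click_prob).astype(float)

# Item popularity: number of interactions per item.
r = Y.sum(axis=0)

# Principal right singular vector of the (stand-in) score matrix.
_, _, Vt = np.linalg.svd(Y, full_matrices=False)
q1 = Vt[0]

# Singular vectors are defined up to sign, so compare absolute alignment.
cos_sim = abs(r @ q1) / (np.linalg.norm(r) * np.linalg.norm(q1))
print(f"cos(r, q1) = {cos_sim:.3f}")
```

On data like this, the cosine similarity comes out close to 1, which is the qualitative behavior the theorem bounds; the toy setup says nothing about the tightness of the bound itself.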

Figures (7)

Figure 1: Illustration of two core insights.
Figure 2: Illustration of how dimension reduction impacts popularity bias on MovieLens: (a)-(b) the proportion of popular items in recommendations and the ratio of the largest singular value ($\sigma_1^2/\sum_{1\le k\le L} {\sigma_k^2}$) with varying embedding dimensions and training epochs, respectively; (c) how the singular values evolve during training.
Figure 3: Pareto curves of compared methods illustrating the trade-off between accuracy and fairness under the common testing paradigm.
Figure 4: The proportion of popular items in recommendations, the ratio of the largest singular value ($\sigma_1^2/\sum_{1\le k\le L} {\sigma_k^2}$), and NDCG@20 with varying $\beta$.
Figure 5: Long-tailed distribution of item popularity in recommendation datasets.
...and 2 more figures
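The ratio $\sigma_1^2/\sum_{1\le k\le L}\sigma_k^2$ reported in Figures 2 and 4 measures how much of the score matrix's energy sits in its principal singular value. A minimal sketch of how such a ratio could be computed is shown below; the helper name `principal_energy_ratio` and the toy matrices are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def principal_energy_ratio(Y):
    """Return sigma_1^2 / sum_k sigma_k^2 for a matrix Y.

    The denominator equals the squared Frobenius norm of Y.
    """
    s = np.linalg.svd(Y, compute_uv=False)  # singular values, sorted descending
    return s[0] ** 2 / np.sum(s ** 2)

# Toy case with singular values 4 and 3: ratio = 16 / (16 + 9) = 0.64.
ratio_toy = principal_energy_ratio(np.diag([4.0, 3.0]))

# A rank-one matrix puts all of its energy in sigma_1, so the ratio is 1.
ratio_rank_one = principal_energy_ratio(np.outer([1.0, 2.0], [3.0, 4.0]))

print(ratio_toy, ratio_rank_one)
```

A ratio near 1 means the predictions are dominated by a single direction, which the paper associates with stronger popularity bias.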

Theorems & Definitions (3)

Theorem 1: Popularity Memorization Effect
Theorem 2: Popularity bias amplification
Theorem 3: Trajectory of singular values (Eq. (6) in Saxe et al., 2019)
