Countering Mainstream Bias via End-to-End Adaptive Local Learning
Jinhao Pan, Ziwei Zhu, Jianling Wang, Allen Lin, James Caverlee
TL;DR
This work targets mainstream bias in collaborative filtering and attributes it to two root causes: discrepancy modeling and unsynchronized learning. It introduces end-To-end Adaptive Local Learning (TALL), which combines a Loss-Driven Mixture-of-Experts backbone with an Adaptive Weight module to produce customized local models and synchronize training across user groups. For user $u$, the gate assigns expert $E_k$ the value $G_k(\mathbf{O}_{u})=\frac{e^{(\mathcal{L}(E_k(\mathbf{O}_{u})))^{-1}}}{\sum_{t=1}^{n_e} e^{(\mathcal{L}(E_t(\mathbf{O}_{u})))^{-1}}}$, a softmax over inverse losses across the $n_e$ experts, so an expert with a lower loss on $\mathbf{O}_{u}$ receives a larger gate value. The Adaptive Weight module sets the user's loss weight with the update $w_{u}^{*}=\max\left( \frac{\mathcal{L}(\mathbf{O}_{u}, \widehat{\mathbf{O}}_{u})-\lambda}{2\alpha}, 0\right)$. A gap mechanism defers adaptive weighting for the first $T$ epochs, and a loss-change signal $\Delta \mathcal{L}^{t}_{u}=\mathcal{L}^{t-1}_{u}-\mathcal{L}^{t}_{u}$ (averaged over $L$ epochs) stabilizes updates. Experiments on ML1M, Yelp, and CDs&Vinyl demonstrate state-of-the-art debiasing, with notable gains for niche users (e.g., on average about 6.07% over LOCA and 10% over EnLFT for the low-mainstream group) while preserving mainstream performance. Code and data are released on GitHub.
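The two formulas above can be read as plain array operations. The following is a minimal sketch, not the authors' implementation: the function names `loss_driven_gate` and `adaptive_weight` are hypothetical, per-expert losses are assumed to be strictly positive scalars for a single user, and $\alpha$ is assumed to be positive. The shift by the maximum is a standard numerical-stability step that leaves the softmax unchanged.

```python
import numpy as np


def loss_driven_gate(expert_losses):
    """Softmax over inverse per-expert losses for one user (hypothetical helper).

    expert_losses[k] stands in for L(E_k(O_u)); values are assumed > 0.
    """
    losses = np.asarray(expert_losses, dtype=float)
    inv = 1.0 / losses              # (L(E_k(O_u)))^{-1}
    shifted = inv - inv.max()       # stability shift; softmax is invariant to it
    exp = np.exp(shifted)
    return exp / exp.sum()


def adaptive_weight(user_loss, lam, alpha):
    """Closed-form weight w_u* = max((L - lambda) / (2 * alpha), 0)."""
    return max((user_loss - lam) / (2.0 * alpha), 0.0)


# Example values are illustrative only, not taken from the paper.
gate = loss_driven_gate([0.5, 1.0, 2.0])
weight = adaptive_weight(user_loss=1.0, lam=0.5, alpha=0.25)
```

In this sketch the expert with the smallest loss gets the largest gate value, and a user whose loss does not exceed $\lambda$ gets weight zero.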
Abstract
Collaborative filtering (CF) based recommendations suffer from mainstream bias, in which mainstream users are favored over niche users, leading to poor recommendation quality for many long-tail users. In this paper, we identify two root causes of this mainstream bias: (i) discrepancy modeling, whereby CF algorithms focus on modeling mainstream users while neglecting niche users with unique preferences; and (ii) unsynchronized learning, where niche users require more training epochs than mainstream users to reach peak performance. Targeting these causes, we propose a novel end-To-end Adaptive Local Learning (TALL) framework to provide high-quality recommendations to both mainstream and niche users. TALL uses a loss-driven Mixture-of-Experts module that adaptively ensembles experts into customized local models for different users. It also contains an adaptive weight module that synchronizes the learning paces of different users by dynamically adjusting their weights in the loss. Extensive experiments demonstrate the state-of-the-art performance of the proposed model. Code and data are provided at \url{https://github.com/JP-25/end-To-end-Adaptive-Local-Leanring-TALL-}.
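To illustrate what tracking a user's learning pace might involve, here is a minimal sketch of the bookkeeping for a per-user loss-change signal $\mathcal{L}^{t-1}_{u}-\mathcal{L}^{t}_{u}$ averaged over a window of epochs, together with a gap of initial epochs during which adaptive weighting stays off. The class `LossChangeTracker`, its parameters `window` and `gap_epochs`, and the choice to treat adaptive weighting as active once `gap_epochs` epochs have completed are all assumptions for illustration. How the averaged signal enters the weight computation is not specified here and is left out of the sketch.

```python
from collections import deque


class LossChangeTracker:
    """Tracks one user's per-epoch loss changes (hypothetical helper)."""

    def __init__(self, window, gap_epochs):
        self.gap_epochs = gap_epochs          # epochs before adaptive weighting
        self.changes = deque(maxlen=window)   # keeps only the last `window` changes
        self.prev_loss = None
        self.epoch = 0                        # number of completed epochs

    def update(self, loss):
        """Record the loss observed at the end of an epoch."""
        if self.prev_loss is not None:
            # Positive value means the loss decreased since the last epoch.
            self.changes.append(self.prev_loss - loss)
        self.prev_loss = loss
        self.epoch += 1

    def averaged_change(self):
        """Mean loss change over the window, or None before any change exists."""
        if not self.changes:
            return None
        return sum(self.changes) / len(self.changes)

    def adaptive_weighting_active(self):
        """Assumption: weighting turns on once the gap epochs have completed."""
        return self.epoch >= self.gap_epochs


# Illustrative losses only, not results from the paper.
tracker = LossChangeTracker(window=2, gap_epochs=3)
for epoch_loss in [1.0, 0.8, 0.7, 0.65]:
    tracker.update(epoch_loss)
```

Averaging over a window rather than using a single-epoch difference smooths out noisy epochs, which matches the stabilizing role described for the signal.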
