LTP-MMF: Towards Long-term Provider Max-min Fairness Under Recommendation Feedback Loops
Chen Xu, Xiaopeng Ye, Jun Xu, Xiao Zhang, Weiran Shen, Ji-Rong Wen
TL;DR
This work tackles long-term provider fairness in multi-stakeholder recommender systems under recommendation feedback loops by introducing LTP-MMF, an online ranking model that combines matrix-factorization accuracy with a dual MMF-based fairness module and UCB-driven exploration. Framed as a batched contextual bandit problem, the method optimizes the objective $R=\frac{1}{T}\sum_t g(\mathbf{x}_t) + \lambda r(\mathbf{e})$ with exponential emphasis on fairness to uplift worst-off providers while maintaining user satisfaction. The approach yields sub-linear regret bounds and demonstrates superior long-term performance on four public datasets, with notable Pareto improvements and efficient online inference. Overall, LTP-MMF offers a scalable, fair, and effective framework for balancing long-term provider exposure and user utility in dynamic recommendation settings.
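To make the shape of the objective $R$ concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The summary does not define $g$, $r$, or $\mathbf{e}$, so the sketch assumes that $g(\mathbf{x}_t)$ is the total estimated relevance of the items shown at step $t$, that $\mathbf{e}$ is each provider's share of all exposure slots, and that $r(\mathbf{e})$ is the share of the worst-off provider. The function name `long_term_objective` and all inputs are hypothetical.

```python
import numpy as np

def long_term_objective(scores, rankings, item_provider, n_providers, lam=1.0):
    """Evaluate R = (1/T) * sum_t g(x_t) + lam * r(e) for a fixed set of rankings.

    scores:        (T, n_items) estimated relevance of every item at each step
    rankings:      (T, K) integer ids of the K items shown at each step
    item_provider: (n_items,) integer provider id of every item
    """
    T, K = rankings.shape
    # g(x_t): ASSUMED to be the summed relevance of the items shown at step t.
    utility = np.array([scores[t, rankings[t]].sum() for t in range(T)])
    # e: ASSUMED to be each provider's share of all T*K exposure slots.
    counts = np.bincount(item_provider[rankings.ravel()], minlength=n_providers)
    exposure = counts / (T * K)
    # r(e): ASSUMED max-min fairness term, the share of the worst-off provider.
    fairness = exposure.min()
    return utility.mean() + lam * fairness, exposure

# Hypothetical data: three items, items 0 and 1 belong to provider 0, item 2 to provider 1.
scores = np.array([[0.9, 0.8, 0.1],
                   [0.9, 0.8, 0.1]])
item_provider = np.array([0, 0, 1])

greedy = np.array([[0, 1], [0, 1]])   # always show the two highest-scored items
mixed = np.array([[0, 1], [0, 2]])    # give the tail provider one slot

r_greedy, e_greedy = long_term_objective(scores, greedy, item_provider, 2, lam=2.0)
r_mixed, e_mixed = long_term_objective(scores, mixed, item_provider, 2, lam=2.0)
print(r_greedy, e_greedy)
print(r_mixed, e_mixed)
```

Under these assumptions, the greedy rankings leave the second provider with no exposure, so the fairness term contributes nothing. With a sufficiently large $\lambda$, the ranking that gives up some relevance to expose the tail provider scores higher.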
Abstract
Multi-stakeholder recommender systems involve various roles, such as users and providers. Previous work has pointed out that max-min fairness (MMF) is a better metric for supporting weak providers. However, the features and parameters of these roles vary over time, so ensuring long-term provider MMF has become a significant challenge. We observe that recommendation feedback loops (RFL) greatly influence provider MMF in the long term. In an RFL, the recommender system receives user feedback only on exposed items and updates the recommender model incrementally based on this feedback. When utilizing the feedback, the recommender model treats unexposed items as negative samples. As a result, tail providers get no opportunity to be exposed, and their items are always considered negative samples. This phenomenon becomes increasingly severe as the RFL continues. To alleviate the problem, this paper proposes an online ranking model named Long-Term Provider Max-min Fairness (LTP-MMF). Theoretical analysis shows that the long-term regret of LTP-MMF enjoys a sub-linear bound. Experimental results on three public recommendation benchmarks demonstrate that LTP-MMF can outperform the baselines in the long term.
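The feedback-loop effect described above can be illustrated with a small deterministic toy simulation. This is a sketch under simplifying assumptions, not the paper's experimental setup: one item per provider, a greedy top-k policy with no exploration, expected click rates in place of sampled clicks, and a simple moving-average update standing in for incremental model training. The function `simulate_feedback_loop` and all numbers are hypothetical.

```python
import numpy as np

def simulate_feedback_loop(init_scores, true_ctr, k=1, steps=50, lr=0.1):
    """Greedy top-k exposure where unexposed items are labeled as negatives."""
    scores = np.asarray(init_scores, dtype=float).copy()
    true_ctr = np.asarray(true_ctr, dtype=float)
    exposure = np.zeros(len(scores), dtype=int)
    for _ in range(steps):
        shown = np.argsort(-scores)[:k]      # expose the k highest-scored items
        exposure[shown] += 1
        target = np.zeros_like(scores)       # unexposed items are treated as negative (label 0)
        target[shown] = true_ctr[shown]      # exposed items receive their expected click feedback
        scores += lr * (target - scores)     # incremental update toward the observed labels
    return exposure, scores

# Two items with identical true appeal; the second starts with a slightly lower estimate.
exposure, scores = simulate_feedback_loop(init_scores=[0.5, 0.45], true_ctr=[0.6, 0.6])
print(exposure)
print(scores)
```

In this toy setting the second item is never exposed even though its true click rate equals that of the first, and its estimated score keeps shrinking because every round adds another negative label.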
