Towards Long-Term User Welfare in Recommender Systems via Creator-Oriented Information Revelation
Xu Zhao, Xiaopeng Ye, Chen Xu, Weiran Shen, Jun Xu
TL;DR
This work addresses the challenge of sustaining long-term user welfare in recommender systems (RS) by explicitly considering how creators shape the content ecosystem. It introduces LoRe, an information-revelation framework that casts the platform as a sender and creators as receivers, using Bayesian persuasion within a Markov decision process and reinforcement learning to optimize signaling under bounded rationality. The approach yields an online-inference and offline-training pipeline with a trust-prediction component, and experiments on YouTube and Amazon demonstrate superior long-term welfare gains relative to re-ranking and naive signaling baselines. The findings suggest that strategically revealing information to creators can improve content diversity, creator retention, and user engagement, offering a practical and integrable route to healthier RS ecosystems.
Abstract
Improving long-term user welfare (e.g., sustained user engagement) has become a central objective of recommender systems (RS). On real-world platforms, the creation behaviors of content creators play a crucial role in shaping long-term welfare beyond short-term recommendation accuracy, so steering creator behavior effectively is essential to fostering a healthier RS ecosystem. Existing works typically rely on re-ranking algorithms that heuristically adjust item exposure to steer creators' behavior. However, when embedded within recommendation pipelines, such a strategy often conflicts with the short-term objective of improving recommendation accuracy, leading to performance degradation and suboptimal long-term welfare. Well-established economic studies offer valuable insights for an alternative approach that does not rely on the design of the recommendation algorithm: revealing information from an information-rich party (sender) to a less-informed party (receiver) can effectively change the receiver's beliefs and steer their behavior. Inspired by this idea, we propose an information-revealing framework named Long-term Welfare Optimization via Information Revelation (LoRe). In this framework, we use a classical information revelation method (i.e., Bayesian persuasion) to model the stakeholders in RS, treating the platform as the sender and creators as the receivers. To address the challenge posed by the unrealistic assumption of traditional economic methods, we formulate the process of information revelation as a Markov Decision Process (MDP) and propose a learning algorithm that is trained, and performs inference, in environments with boundedly rational creators. Extensive experiments on two real-world RS datasets demonstrate that our method outperforms existing fair re-ranking methods and information-revealing strategies in improving long-term user welfare.
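To make the sender-receiver idea concrete, the following is a minimal sketch of textbook Bayesian persuasion in a binary-state setting. It is not the LoRe algorithm: the function names, the prior of 0.3, and the action threshold of 0.5 are illustrative assumptions, and the receiver is assumed to be fully rational, which is the kind of idealization that boundedly rational creators would violate. The platform (sender) knows whether demand for a topic is high or low; the creator (receiver) produces content only if the posterior belief in high demand reaches the threshold.

```python
def posterior(prior, p_signal_given_high, p_signal_given_low):
    """Bayes' rule: P(state = high | signal was sent)."""
    evidence = prior * p_signal_given_high + (1 - prior) * p_signal_given_low
    if evidence == 0:
        raise ValueError("signal has zero probability under this scheme")
    return prior * p_signal_given_high / evidence


def persuasive_scheme(prior, threshold):
    """Sender-optimal scheme for the binary case (hypothetical helper).

    The sender always sends 'create' when the state is high, and sends it
    with probability q when the state is low. q is the largest value that
    keeps the posterior after 'create' at or above the receiver's threshold.
    """
    if prior >= threshold:
        return 1.0  # the receiver already acts on the prior alone
    return prior * (1 - threshold) / ((1 - prior) * threshold)


def action_probability(prior, q):
    """Probability that a rational receiver ends up taking the action."""
    return prior + (1 - prior) * q


# Illustrative numbers (assumptions, not taken from the paper).
prior, threshold = 0.3, 0.5
q = persuasive_scheme(prior, threshold)
belief_after_signal = posterior(prior, 1.0, q)

no_disclosure = 1.0 if prior >= threshold else 0.0  # receiver acts on the prior
full_disclosure = prior                             # receiver acts only if state is high
persuasion = action_probability(prior, q)

print(f"q = {q:.3f}, posterior after 'create' = {belief_after_signal:.3f}")
print(f"P(action): none={no_disclosure:.2f}, "
      f"full={full_disclosure:.2f}, persuasion={persuasion:.2f}")
```

In this toy setting, partially informative signals induce the action more often than either withholding or fully revealing the state, without touching any ranking of items. That is the intuition behind steering creators through revealed information rather than through exposure adjustments.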
