Addressing bias in Recommender Systems: A Case Study on Data Debiasing Techniques in Mobile Games
Yixiong Wang, Maria Paskevich, Hui Wang
TL;DR
The paper tackles bias in mobile-game recommender systems trained on implicit feedback, focusing on the exposure, selection, and position biases that arise from limited in-game placements. It evaluates several debiasing strategies (inverse propensity scoring (IPS), Doubly Robust (DR) learning, and AutoDebias) on public explicit-feedback datasets and King’s internal implicit-feedback datasets, comparing each against a biased matrix factorization (MF) baseline on RMSE, AUC, NDCG@5, Gini, and Entropy, plus training time. Key findings: IPS is simple and model-agnostic but offers only modest gains; AutoDebias achieves the strongest predictive improvements, at high computational cost; DR improves diversification metrics and yields solid gains on some datasets, though its benefits depend on whether randomized data splits are available. The work provides practical guidance for deploying debiasing in game RS, highlights the trade-offs between data collection cost, accuracy, and diversity, and proposes future directions that use online reinforcement learning and hybrid methods to optimize both performance and efficiency.
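To make the terms above concrete, the following is a minimal sketch of an IPS-weighted loss and of the two diversity metrics (Gini and Entropy) computed over item exposure counts. It is illustrative only and is not the paper's implementation: the function names, the propensity clipping threshold `clip`, the normalization by `n_pairs`, and the use of the natural logarithm are all assumptions made for this example.

```python
import math


def ips_weighted_mse(labels, preds, propensities, n_pairs, clip=0.05):
    """IPS estimate of the mean squared error over all user-item pairs.

    Each observed interaction is weighted by 1 / propensity, where the
    propensity is the (estimated) probability that the pair was observed.
    `clip` (an assumed value) bounds the weights to limit variance.
    `n_pairs` is the total number of user-item pairs, observed or not.
    """
    total = 0.0
    for y, y_hat, p in zip(labels, preds, propensities):
        weight = 1.0 / max(p, clip)
        total += weight * (y - y_hat) ** 2
    return total / n_pairs


def gini(exposure_counts):
    """Gini coefficient of item exposure: 0 means perfectly even exposure."""
    xs = sorted(exposure_counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending-sorted values with 1-based rank i.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def entropy(exposure_counts):
    """Shannon entropy (natural log) of the item exposure distribution."""
    total = sum(exposure_counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in exposure_counts if c > 0]
    return -sum(p * math.log(p) for p in probs)
```

Under these definitions, a recommender that spreads exposure evenly across items has a Gini near 0 and a high Entropy, while one that concentrates exposure on a few items has a Gini near 1 and a low Entropy.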
Abstract
The mobile gaming industry, particularly the free-to-play sector, has been around for more than a decade, yet it is still growing rapidly. The games-as-a-service model requires game developers to pay much more attention to how content is recommended within their games. Recommender systems (RS), however, inevitably bring with them the problem of bias in the data. A great deal of research has addressed bias in RS for online retail and services, but much less is available for the specific case of the game industry. Moreover, previous work has mostly tested debiasing techniques on explicit feedback datasets, whereas mobile gaming data far more commonly contains only implicit feedback. This case study aims to identify and categorize potential biases in datasets specific to model-based recommendations in mobile games, review debiasing techniques from the existing literature, and assess their effectiveness on real-world data gathered through implicit feedback. The methods are evaluated on their debiasing quality, data requirements, and computational demands.
