Treatment Effect Estimation for User Interest Exploration on Recommender Systems
Jiaju Chen, Wenjie Wang, Chongming Gao, Peng Wu, Jianxiong Wei, Qingsong Hua
TL;DR
This paper addresses bias in user feedback that obscures users' hidden interests in recommender systems by reframing the allocation of top-N exposure across item categories as a causal treatment optimization problem. It introduces UpliftRec, which estimates a multivariate average dose-response function (ADRF) from observational data using inverse propensity weighting (IPW), discretizes the treatments, and uses dynamic programming to find the category allocation that maximizes overall CTR. It also provides a variance-reducing MTEF variant that adjusts the backend model's scores. The authors validate the approach on three real-world datasets, showing improved accuracy and serendipity over diverse baselines and across backend models, with ablations confirming the value of IPW and MTEF. The work advances practical uplift modeling in recommendation, enabling more effective discovery of latent interests while maintaining recommendation quality, and it releases code and data for reproducibility.
Abstract
Recommender systems learn personalized user preferences from user feedback such as clicks. However, user feedback is usually biased towards partially observed interests, leaving many users' hidden interests unexplored. Existing approaches typically mitigate the bias, increase recommendation diversity, or use bandit algorithms to balance the exploration-exploitation trade-off. Nevertheless, they fail to consider the potential rewards of recommending different categories of items and lack a global schedule for allocating the top-N recommendation slots across categories, leading to suboptimal exploration. In this work, we propose an Uplift model-based Recommender (UpliftRec) framework, which regards top-N recommendation as a treatment optimization problem. UpliftRec estimates the treatment effects, i.e., the click-through rate (CTR) under different category exposure ratios, from observational user feedback. UpliftRec calculates group-level treatment effects to discover users' hidden interests with high CTR rewards and leverages inverse propensity weighting to alleviate confounder bias. Thereafter, UpliftRec adopts a dynamic programming method to calculate the optimal treatment for overall CTR maximization. We implement UpliftRec on different backend models and conduct extensive experiments on three datasets. The empirical results validate the effectiveness of UpliftRec in discovering users' hidden interests while achieving superior recommendation accuracy.
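The two steps named in the abstract, inverse propensity weighting and dynamic programming over category allocations, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `ipw_ctr_by_level` and `allocate_slots`, the log format, and the toy reward table are all hypothetical. The sketch also makes two simplifying assumptions that the abstract does not state: propensities are already known or estimated, and the expected clicks of a top-N list add up across categories.

```python
from collections import defaultdict


def ipw_ctr_by_level(logs):
    """Self-normalized IPW estimate of CTR for each discrete exposure level.

    logs: iterable of (level, propensity, clicked) tuples. `propensity` is the
    probability of having received that exposure level (assumed given here).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for level, propensity, clicked in logs:
        weight = 1.0 / propensity  # rarely shown levels get larger weights
        num[level] += weight * clicked
        den[level] += weight
    return {level: num[level] / den[level] for level in den}


def allocate_slots(reward, n_slots):
    """Split n_slots across categories to maximize total estimated reward.

    reward[k][n] is the estimated expected number of clicks when category k
    receives n slots, for n = 0..n_slots. Assumption: rewards are additive
    across categories. Returns (best_total, slots_per_category).
    """
    n_cats = len(reward)
    neg_inf = float("-inf")
    # best[k][s]: best total using the first k categories and exactly s slots
    best = [[neg_inf] * (n_slots + 1) for _ in range(n_cats + 1)]
    choice = [[0] * (n_slots + 1) for _ in range(n_cats + 1)]
    best[0][0] = 0.0
    for k in range(1, n_cats + 1):
        for s in range(n_slots + 1):
            for n in range(s + 1):  # slots given to category k - 1
                prev = best[k - 1][s - n]
                if prev == neg_inf:
                    continue
                candidate = prev + reward[k - 1][n]
                if candidate > best[k][s]:
                    best[k][s] = candidate
                    choice[k][s] = n
    # Walk back through the recorded choices to recover the allocation.
    allocation = [0] * n_cats
    s = n_slots
    for k in range(n_cats, 0, -1):
        allocation[k - 1] = choice[k][s]
        s -= allocation[k - 1]
    return best[n_cats][n_slots], allocation


# Toy reward table: 3 categories, top-4 list (values are made up).
toy_reward = [
    [0.0, 0.30, 0.52, 0.60, 0.65],
    [0.0, 0.20, 0.38, 0.54, 0.68],
    [0.0, 0.25, 0.30, 0.32, 0.33],
]
total, allocation = allocate_slots(toy_reward, 4)
print(allocation, round(total, 2))
```

On this toy table the program assigns two slots to the first category and one slot to each of the other two. The third category gets a slot despite its low ceiling because its first slot has a high marginal reward, which is the kind of hidden interest a purely exploitative ranking could miss.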
