A First Look at Selection Bias in Preference Elicitation for Recommendation
Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke
TL;DR
Problem: selection bias in preference elicitation (PE) for recommender systems has been overlooked. Approach: the paper introduces a simulator that generates PE interactions from static data and adapts inverse propensity scoring (IPS) to debias topic-level PE, evaluated on semi-synthetic Yahoo! R3 data and in a fully synthetic setting with generated topics. Key findings: ignoring PE bias leads to biased representations and degraded downstream recommendations, while IPS-based debiasing significantly improves MAE, MSE, and NDCG@3 across the tested settings; the results underscore the need for debiasing in PE and the utility of the simulator. Contributions: a first empirical exploration of selection bias in PE, a publicly released simulator and code, and a foundation for future joint debiasing across PE and downstream tasks.
Abstract
Preference elicitation explicitly asks users what kind of recommendations they would like to receive. It is a popular technique for conversational recommender systems to deal with cold-starts. Previous work has studied selection bias in implicit feedback, e.g., clicks, and in some forms of explicit feedback, i.e., ratings on items. Although the extreme sparsity of preference elicitation interactions makes them considerably more prone to selection bias than natural interactions, the effect of selection bias in preference elicitation on the resulting recommendations has not yet been studied. To address this gap, we take a first look at the effects of selection bias in preference elicitation and at how they may be further investigated in the future. We find that a major hurdle is the current lack of any publicly available dataset that contains preference elicitation interactions. As a solution, we propose a simulation of a topic-based preference elicitation process. The results from our simulation-based experiments indicate (i) that ignoring the effect of selection bias early in preference elicitation can exacerbate overrepresentation in subsequent item recommendations, and (ii) that debiasing methods can alleviate this effect, which leads to significant improvements in subsequent item recommendation performance. Our aim is for the proposed simulator and initial results to provide a starting point and motivation for future research into this important but overlooked problem setting.
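To make the debiasing idea named in the TL;DR concrete, the following is a minimal sketch of a generic inverse propensity scoring (IPS) estimate of mean squared error over user-topic pairs. It is not the authors' implementation: the function names `ips_mse` and `naive_mse`, the toy ratings, and the propensity values are all hypothetical, and the propensities are assumed to be known rather than estimated.

```python
def naive_mse(preds, labels, observed):
    """Average squared error over observed pairs only (ignores selection bias)."""
    errs = [(p - y) ** 2 for p, y, o in zip(preds, labels, observed) if o]
    return sum(errs) / len(errs)


def ips_mse(preds, labels, observed, propensities):
    """IPS estimate of the squared error averaged over ALL pairs.

    Only observed pairs contribute to the sum, each weighted by
    1 / propensity; the sum is normalized by the total number of pairs.
    """
    total = 0.0
    for p, y, o, prop in zip(preds, labels, observed, propensities):
        if o:
            total += (p - y) ** 2 / prop
    return total / len(preds)


# Hypothetical toy data: four user-topic pairs, a constant prediction of 4.0.
# The label of the unobserved pair is listed only so the true error can be
# computed for comparison; the estimators never read it.
preds = [4.0, 4.0, 4.0, 4.0]
labels = [5.0, 5.0, 1.0, 1.0]
observed = [1, 1, 1, 0]                # the last pair was never elicited
propensities = [1.0, 1.0, 0.5, 0.5]    # assumed known observation probabilities

true_mse = sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)
naive = naive_mse(preds, labels, observed)
ips = ips_mse(preds, labels, observed, propensities)
```

In this toy case the naive estimate underweights the rarely observed low ratings, while the IPS estimate upweights the one observed low rating to stand in for the missing one. With real data the propensities must themselves be estimated, and small propensities can make the estimate high-variance.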
