Cold-start Recommendation by Personalized Embedding Region Elicitation
Hieu Trung Nguyen, Duy Nguyen, Khoa Doan, Viet Anh Nguyen
TL;DR
The paper addresses cold-start recommendation by introducing PERE, a two-phase elicitation framework that personalizes seed items via a DPP burn-in and then adaptively queries users to refine a region-based embedding of their preferences. Rather than a point estimate, the user is represented by a region in the embedding space, whose Chebyshev center serves as the safe focal point for making recommendations; the value of each new rating is tied to the contraction of this region. The method solves a series of linear programs to compute the Chebyshev center and selects subsequent questions by balancing a probability of item experience with the informativeness of the induced hyperplane cuts. Empirical results on Gowalla and Amazon-Books (with embeddings from LightGCN and biVAE) show that PERE outperforms baselines across multiple metrics and remains robust to misspecification, demonstrating practical improvements in cold-start scenarios with manageable computation times.
Abstract
Rating elicitation is a key element for recommender systems to perform well at cold-starting, in which the system needs to recommend items to a newly arrived user with no prior knowledge about the user's preferences. Existing elicitation methods employ a fixed set of items to learn the user's preferences and then infer the user's preferences on the remaining items. Using a fixed seed set can limit the performance of the recommender system, since the seed set is unlikely to be optimal for all new users with potentially diverse preferences. This paper addresses this challenge using a two-phase, personalized elicitation scheme. First, the elicitation scheme asks the user to rate a small set of popular items in a "burn-in" phase. Second, it sequentially asks the user to rate adaptively chosen items to refine the preference estimate and the user's representation. Throughout the process, the system represents the user's embedding value not by a point estimate but by a region estimate. The value of information obtained by asking for the user's rating on an item is quantified by the distance from the center of the region in the embedding space that contains, with high confidence, the true embedding value of the user. Finally, the recommendations are successively generated by considering the preference region of the user. We show that each subproblem in the elicitation scheme can be efficiently implemented. Further, we empirically demonstrate the effectiveness of the proposed method against existing rating-elicitation methods on several prominent datasets.
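
To make the region estimate concrete, the following minimal sketch computes a Chebyshev center with a single linear program. It assumes the preference region is a polytope {u : A u <= b} and uses the textbook formulation (maximize r subject to a_i^T u + r ||a_i|| <= b_i); the paper's actual constraints, and how a rating is turned into a hyperplane cut, are not specified in this section. The function name `chebyshev_center`, the 2-D box, and the example cut are hypothetical choices for illustration only.

```python
import numpy as np
from scipy.optimize import linprog


def chebyshev_center(A, b):
    """Center and radius of the largest ball inside {u : A u <= b}.

    Illustrative helper (not from the paper): standard LP formulation.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    norms = np.linalg.norm(A, axis=1)

    # Decision variables are (u, r). Maximizing r equals minimizing -r.
    c = np.zeros(A.shape[1] + 1)
    c[-1] = -1.0

    # Each row encodes a_i^T u + r * ||a_i|| <= b_i.
    A_ub = np.hstack([A, norms[:, None]])

    # u is free, the radius r must be nonnegative.
    bounds = [(None, None)] * A.shape[1] + [(0, None)]

    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds)
    if not res.success:
        raise ValueError("region is empty or unbounded: " + res.message)
    return res.x[:-1], res.x[-1]


# Hypothetical starting region: the box [-1, 1]^2 in a 2-D embedding space.
A_box = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b_box = np.ones(4)
center0, radius0 = chebyshev_center(A_box, b_box)

# Assumed effect of one rating: a hyperplane cut u_1 <= 0 is added,
# which shrinks the region and therefore the Chebyshev radius.
A_cut = np.vstack([A_box, [1.0, 0.0]])
b_cut = np.append(b_box, 0.0)
center1, radius1 = chebyshev_center(A_cut, b_cut)
```

In this toy setting the radius shrinks after the cut, which mirrors the idea that the value of a rating is tied to how much it contracts the region. The second coordinate of `center1` is not unique here, because the largest ball can slide along the longer side of the remaining rectangle.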
