Cold-start Recommendation by Personalized Embedding Region Elicitation

Hieu Trung Nguyen, Duy Nguyen, Khoa Doan, Viet Anh Nguyen

TL;DR

The paper addresses cold-start recommendation by introducing PERE, a two-phase elicitation framework that personalizes seed items via a determinantal point process (DPP) burn-in and then adaptively queries the user to refine a region-based embedding of their preferences. Rather than a point estimate, the user is represented by a region in the embedding space, whose Chebyshev center serves as the safe focal point for making recommendations; the value of each new rating is tied to the contraction of this region. The method solves a series of linear programs to compute the Chebyshev center and selects subsequent questions by balancing the probability that the user has experienced an item against the informativeness of the induced hyperplane cuts. Empirical results on Gowalla and Amazon-Books (with embeddings from LightGCN and biVAE) show that PERE outperforms baselines across multiple metrics and remains robust to misspecification, demonstrating practical improvements in cold-start scenarios with manageable computation times.

Abstract

Rating elicitation is a key element for recommender systems to perform well at cold-starting, in which the system needs to recommend items to a newly arrived user with no prior knowledge about the user's preferences. Existing elicitation methods employ a fixed set of items to learn the user's preferences and then infer the user's preferences on the remaining items. Using a fixed seed set can limit the performance of the recommendation system, since the seed set is unlikely to be optimal for all new users with potentially diverse preferences. This paper addresses this challenge using a two-phase, personalized elicitation scheme. First, the elicitation scheme asks the user to rate a small set of popular items in a "burn-in" phase. Second, it sequentially asks the user to rate adaptively chosen items to refine the preference and the user's representation. Throughout the process, the system represents the user's embedding value not by a point estimate but by a region estimate. The value of information obtained by asking for the user's rating on an item is quantified by the distance from the center of the region in the embedding space that contains, with high confidence, the true embedding value of the user. Finally, the recommendations are successively generated by considering the preference region of the user. We show that each subproblem in the elicitation scheme can be efficiently implemented. Further, we empirically demonstrate the effectiveness of the proposed method against existing rating-elicitation methods on several prominent datasets.

Paper Structure

This paper contains 23 sections, 2 theorems, 33 equations, 7 figures, and 12 tables.

Key Result

Theorem 1

Suppose that $\mathcal{U}_{\mathbb P}$ has a non-empty interior. The Chebyshev center $u_c^\star$ of the set $\mathcal{U}_{\mathbb P}$ can be found by solving the following problem
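
The optimization problem referenced in Theorem 1 is not reproduced in this excerpt. As a hedged illustration only, the sketch below solves the textbook linear program for the Chebyshev center of a polyhedron $\{u : a_k^\top u \le b_k\}$, namely maximize $r$ subject to $a_k^\top u + r \|a_k\|_2 \le b_k$. It assumes, based on the hyperplanes in the Figure 2 caption, that preferring item $i$ to item $j$ means $u$ is no farther from $v_i$ than from $v_j$. The function names, the bounding box (added so the toy region is bounded), and the example items are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog


def preference_halfspaces(items, prefs):
    """Build (A, b) with A @ u <= b from pairs (i, j) meaning 'i is preferred to j'.

    Assumption: preferring i to j means ||u - v_i|| <= ||u - v_j||, which is
    equivalent to 2 u^T (v_j - v_i) <= ||v_j||^2 - ||v_i||^2.
    """
    A = np.array([2.0 * (items[j] - items[i]) for i, j in prefs])
    b = np.array([items[j] @ items[j] - items[i] @ items[i] for i, j in prefs])
    return A, b


def chebyshev_center(A, b, box):
    """Center and radius of the largest ball inside {u : A u <= b, |u_k| <= box}."""
    d = A.shape[1]
    # Hypothetical box constraints keep the toy polyhedron bounded.
    A_all = np.vstack([A, np.eye(d), -np.eye(d)])
    b_all = np.concatenate([b, box * np.ones(2 * d)])
    row_norms = np.linalg.norm(A_all, axis=1)

    # Decision variables are (u, r); maximizing r means minimizing -r.
    cost = np.zeros(d + 1)
    cost[-1] = -1.0
    A_ub = np.hstack([A_all, row_norms[:, None]])
    bounds = [(None, None)] * d + [(0, None)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_all, bounds=bounds, method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.x[:d], res.x[-1]


# Toy 2-D item embeddings; the user states that item 0 beats items 1 and 2.
items = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
prefs = [(0, 1), (0, 2)]

A, b = preference_halfspaces(items, prefs)
center, radius = chebyshev_center(A, b, box=3.0)

# Rank items by proximity to the Chebyshev center, nearest first.
ranking = np.argsort(np.linalg.norm(items - center, axis=1))
```

In this toy case the two preferences reduce to $u_1 \le 1$ and $u_2 \le 1$, so together with the box the feasible region is the square $[-3, 1]^2$, whose Chebyshev center is $(-1, -1)$ with radius $2$; item 0 is the nearest item to that center.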

Figures (7)

  • Figure 1: When a new user arrives, we use a determinantal point process to query a diverse set of items from the list of $P$ popular items to construct the burn-in questionnaire. Subsequently, we use a sequential question-answering procedure to refine the embedding region of the user's preferences. The recommendation is made using the Chebyshev center of the embedding region, which is consistent with the user's stated preferences.
  • Figure 2: The hyperplanes $2 u_{c}^\top (v_i - v_j) = \|v_i\|_2^2 - \|v_j\|_2^2$ for $i \succsim j \in \mathbb P$ are drawn as black lines, and they define the boundary of the set $\mathcal{U}_{\mathbb P}$. The ball centered at the Chebyshev center $u_c^\star$ with radius $r$ is the largest inscribed Euclidean ball of $\mathcal{U}_{\mathbb P}$. Our model recommends items based on the proximity to the Chebyshev center: here, two movies nearest to $u_c^\star$ are highlighted.
  • Figure 3: As the value of $\kappa_0$ increases, NDCG@50 increases under the inconsistent-preference setting.
  • Figure 4: Performance improvements with the dynamic questionnaire size on Amazon-Books and Gowalla datasets.
  • Figure 5: As the value of $\kappa_0$ increases, the probability that the user has prior experience (see Assumption \ref{a:exp-prob}) with an item is dampened. Plot with $d = 64$; the maximal value of $c_{0i}$ is $\sqrt{d} = 8$.
  • ...and 2 more figures

Theorems & Definitions (5)

  • Theorem 1: Chebyshev center
  • proof
  • Definition 1: $L$-ensemble DPP
  • Theorem 2: Chebyshev center with inconsistent elicitation
  • proof
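
Definition 1 itself is not reproduced in this excerpt. Under the standard definition of an $L$-ensemble DPP, a subset $S$ is drawn with probability $\det(L_S) / \det(L + I)$ for a positive semidefinite kernel $L$. The sketch below evaluates that formula on a toy kernel $L = V V^\top$ built from hypothetical item embeddings (the kernel choice is an assumption, not necessarily the one used in the paper) to show why near-duplicate items are unlikely to appear together in a burn-in questionnaire.

```python
import itertools

import numpy as np


def l_ensemble_prob(L, subset):
    """Probability of drawing exactly `subset` under an L-ensemble DPP."""
    idx = list(subset)
    # The determinant of the empty principal submatrix is 1 by convention.
    numerator = np.linalg.det(L[np.ix_(idx, idx)]) if idx else 1.0
    return numerator / np.linalg.det(L + np.eye(L.shape[0]))


# Hypothetical embeddings: items 0 and 1 are nearly parallel, item 2 is orthogonal to item 0.
V = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
L = V @ V.T  # assumed similarity kernel

p_similar = l_ensemble_prob(L, [0, 1])  # two near-duplicate items
p_diverse = l_ensemble_prob(L, [0, 2])  # two dissimilar items

# Sanity check: the probabilities over all subsets should sum to one.
all_subsets = itertools.chain.from_iterable(
    itertools.combinations(range(3), k) for k in range(4)
)
total = sum(l_ensemble_prob(L, s) for s in all_subsets)
```

Here `p_diverse` exceeds `p_similar` because the determinant of a principal submatrix shrinks as its rows become more collinear, which is the mechanism that makes DPP samples diverse.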