Learning Preference from Observed Rankings
Yu-Chang Chen, Chen Chian Fuh, Shang En Tsai
TL;DR
The paper tackles learning individual preferences from incomplete ranking data subject to exposure bias. It develops a flexible logistic framework that decomposes utility into interpretable attributes, item fixed effects, and a low-rank latent factor term. Exposure bias in which comparisons are observed is corrected with inverse-probability weighting, and the weighted log-likelihood is ridge-regularized. Estimation scales via stochastic gradient descent (SGD) with inverse-probability resampling. The method is demonstrated on transaction data from an online wine retailer, where it delivers improved out-of-sample recommendations, especially for previously unconsumed products, and enables market-level targeting through composition-based lifts. The results highlight substantial heterogeneity in origin and price preferences, the value of combining attribute information with latent structure, and practical managerial gains in both personalized recommendations and segment-focused marketing decisions.
Abstract
Estimating consumer preferences is central to many problems in economics and marketing. This paper develops a flexible framework for learning individual preferences from partial ranking information by interpreting observed rankings as collections of pairwise comparisons with logistic choice probabilities. We model latent utility as the sum of interpretable product attributes, item fixed effects, and a low-rank user-item factor structure, enabling both interpretability and information sharing across consumers and items. We further correct for selection in which comparisons are observed: a comparison is recorded only if both items enter the consumer's consideration set, inducing exposure bias toward frequently encountered items. We model pair observability as the product of item-level observability propensities and estimate these propensities with a logistic model for the marginal probability that an item is observable. Preference parameters are then estimated by maximizing an inverse-probability-weighted (IPW), ridge-regularized log-likelihood that reweights observed comparisons toward a target comparison population. To scale computation, we propose a stochastic gradient descent (SGD) algorithm based on inverse-probability resampling, which draws comparisons in proportion to their IPW weights. In an application to transaction data from an online wine retailer, the method improves out-of-sample recommendation performance relative to a popularity-based benchmark, with particularly strong gains in predicting purchases of previously unconsumed products.
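The estimation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fit_ipw_sgd`, the hyperparameters, the toy data, and the choice to apply the ridge penalty only to the parameters touched at each step are all assumptions made for illustration. Item propensities are taken as given here, whereas the paper estimates them with a logistic model. Each observed comparison (user i prefers item j to item k) receives weight 1 / (pi_j * pi_k); comparisons are resampled in proportion to these weights, and each draw takes an unweighted gradient step on the logistic log-likelihood.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_ipw_sgd(users, winners, losers, X, propensity, n_users,
                rank=2, lr=0.05, ridge=1e-3, n_steps=2000, seed=0):
    """Hypothetical sketch: SGD with inverse-probability resampling.

    Utility of item j for user i: X[j] @ beta + alpha[j] + P[i] @ Q[j].
    """
    users, winners, losers = map(np.asarray, (users, winners, losers))
    rng = np.random.default_rng(seed)
    n_items, n_attr = X.shape
    beta = np.zeros(n_attr)                            # attribute coefficients
    alpha = np.zeros(n_items)                          # item fixed effects
    P = 0.01 * rng.standard_normal((n_users, rank))    # user factors
    Q = 0.01 * rng.standard_normal((n_items, rank))    # item factors

    # IPW weight of a pair: inverse of the product of item propensities.
    w = 1.0 / (propensity[winners] * propensity[losers])
    # Resample comparisons in proportion to their IPW weights.
    draws = rng.choice(len(w), size=n_steps, p=w / w.sum())

    for t in draws:
        i, j, k = users[t], winners[t], losers[t]
        dx = X[j] - X[k]
        dq = Q[j] - Q[k]
        diff = dx @ beta + alpha[j] - alpha[k] + P[i] @ dq
        g = 1.0 - sigmoid(diff)      # derivative of log sigmoid(diff)
        p_i = P[i].copy()            # keep pre-update user factors
        # Gradient ascent; ridge shrinkage on touched parameters only
        # (a simplification assumed for this sketch).
        beta += lr * (g * dx - ridge * beta)
        alpha[j] += lr * (g - ridge * alpha[j])
        alpha[k] += lr * (-g - ridge * alpha[k])
        P[i] += lr * (g * dq - ridge * P[i])
        Q[j] += lr * (g * p_i - ridge * Q[j])
        Q[k] += lr * (-g * p_i - ridge * Q[k])
    return beta, alpha, P, Q

# Toy data (made up): 4 items with one attribute, 2 users, and
# 6 comparisons in which the higher-attribute item is always preferred.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
propensity = np.array([0.9, 0.5, 0.5, 0.2])   # assumed item observabilities
users = [0, 0, 0, 1, 1, 1]
winners = [3, 2, 1, 3, 2, 3]
losers = [0, 1, 0, 1, 0, 2]

beta, alpha, P, Q = fit_ipw_sgd(users, winners, losers, X, propensity, n_users=2)
```

Resampling in proportion to the weights, rather than multiplying each gradient by its weight, keeps every step on the same scale even when a few rarely observed items carry very large weights.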
