Dynamic Assortment Selection and Pricing with Censored Preference Feedback

Jung-hun Kim, Min-hwan Oh

TL;DR

This work studies dynamic multi-product assortment selection and pricing under a censored multinomial logit (C-MNL) model, in which buyers filter out items priced above their valuation and purchase at most one of the remaining items. It proposes a Lower Confidence Bound (LCB) pricing strategy combined with either UCB or Thompson Sampling (TS) for assortment selection, which allows valuations and price sensitivities to be learned from censored feedback. The authors establish regret bounds of $\tilde{O}(d^{3/2}\sqrt{T/\kappa})$ for the UCB-based UCBA-LCBP algorithm and $\tilde{O}(d^{2}\sqrt{T/\kappa})$ for the TS-based TSA-LCBP algorithm, with additional terms that depend on the problem dimension $d$ and the nonlinearity constant $\kappa$. Empirical results on synthetic datasets corroborate the theoretical findings, showing sublinear regret and robustness to activation censorship, and thereby offer practical strategies for dynamic pricing and assortment selection under censored feedback.
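The choice model described above can be illustrated with a minimal sketch. It is not the authors' code: the utility form `v - a * p` (valuation minus price sensitivity times price), the no-purchase weight of 1, and the function names `cmnl_choice_probs` and `expected_revenue` are assumptions made for illustration only. The summary itself states only that items priced above their valuation are filtered out and that at most one item is bought.

```python
import math


def cmnl_choice_probs(valuations, sensitivities, prices):
    """Purchase probabilities under an assumed censored-MNL form.

    Returns (probs, p_none): probs[i] is the probability of buying item i,
    p_none is the probability of buying nothing.
    """
    weights = []
    for v, a, p in zip(valuations, sensitivities, prices):
        if p <= v:
            # Item survives the filter; assumed utility is v - a * p.
            weights.append(math.exp(v - a * p))
        else:
            # Censored: priced above the valuation, so it cannot be chosen.
            weights.append(0.0)
    denom = 1.0 + sum(weights)  # the 1.0 is the assumed no-purchase option
    return [w / denom for w in weights], 1.0 / denom


def expected_revenue(valuations, sensitivities, prices):
    """Expected revenue of an offered assortment: sum of price * purchase probability."""
    probs, _ = cmnl_choice_probs(valuations, sensitivities, prices)
    return sum(p * q for p, q in zip(prices, probs))


# Hypothetical two-item assortment: the second item is priced above its valuation.
probs, p_none = cmnl_choice_probs([1.0, 0.5], [1.0, 1.0], [0.8, 0.9])
print(probs, p_none)
```

In this example the second item contributes nothing to revenue and produces no preference feedback, which is the difficulty that censoring adds on top of a standard MNL bandit.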

Abstract

In this study, we investigate the problem of dynamic multi-product selection and pricing by introducing a novel framework based on a censored multinomial logit (C-MNL) choice model. In this model, sellers present a set of products with prices, and buyers filter out products priced above their valuation, purchasing at most one product from the remaining options based on their preferences. The goal is to maximize seller revenue by dynamically adjusting product offerings and prices, while learning both product valuations and buyer preferences through purchase feedback. To achieve this, we propose a Lower Confidence Bound (LCB) pricing strategy. By combining this pricing strategy with either an Upper Confidence Bound (UCB) or Thompson Sampling (TS) product selection approach, our algorithms achieve regret bounds of $\tilde{O}(d^{3/2}\sqrt{T/\kappa})$ and $\tilde{O}(d^{2}\sqrt{T/\kappa})$, respectively. Finally, we validate the performance of our methods through simulations, demonstrating their effectiveness.
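One plausible reading of the LCB pricing idea is sketched below, under stated assumptions: the seller holds a point estimate of each valuation together with a confidence width, and prices each item at the estimate minus a multiple of that width, so that the price stays below the true valuation whenever the confidence interval holds. The function name `lcb_prices`, the scaling parameter `beta`, and the clipping at zero are hypothetical and are not taken from the paper.

```python
def lcb_prices(v_hat, widths, beta=1.0):
    """Price each item at an assumed lower confidence bound of its valuation.

    v_hat:  estimated valuations (hypothetical inputs)
    widths: confidence widths for those estimates (hypothetical inputs)
    beta:   confidence scaling, an assumption of this sketch
    """
    # Clipping at zero is an illustrative choice, not something the source specifies.
    return [max(0.0, v - beta * w) for v, w in zip(v_hat, widths)]


# Hypothetical estimates: the second item is highly uncertain, so it is priced at zero.
prices = lcb_prices([1.0, 0.5], [0.2, 0.6])
print(prices)
```

The design intuition is that an overpriced item is censored and yields neither revenue nor feedback, whereas a slightly underpriced item still does. Under this reading, pricing is pessimistic while assortment selection (UCB or TS) remains optimistic.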

Paper Structure

This paper contains 24 sections, 18 theorems, 123 equations, 2 figures, 1 table, 3 algorithms.

Key Result

Theorem 1

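Under Assumption ass:bd, the policy $\pi$ of Algorithm alg:ucb achieves a regret bound of $\tilde{O}(d^{3/2}\sqrt{T/\kappa})$ (the leading term given in the abstract for the UCB-based algorithm).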

Figures (2)

  • Figure 1: Illustration of the process involved in making a purchase.
  • Figure 2: Experimental results for the regret of the algorithms.

Theorems & Definitions (19)

  • Definition 1: Censored multinomial logit choice model
  • Theorem 1
  • Theorem 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Lemma 7
  • ...and 9 more