Dynamic Pricing with Adversarially-Censored Demands

Jianyu Xu, Yining Wang, Xi Chen, Yu-Xiang Wang

TL;DR

This work tackles online dynamic pricing with adversarially censored demand due to perishable inventories. It introduces the C20CB algorithm, which first performs pure exploration to estimate the linear demand parameters $a$ and $b$ from biased censored observations, then uses an optimistic, derivative-based pricing rule to learn the noise distribution and approach the time-varying optimal price. The authors prove a high-probability regret bound of $\tilde{O}(\sqrt{T})$, matching information-theoretic lower bounds for related settings, and demonstrate the method’s robustness to adversarial inventory sequences. The proposed framework advances online decision-making under censored feedback and has potential extensions to contextual pricing, unbounded-noise scenarios, and non-linear demand, with significant implications for pricing strategies on perishable goods.

Abstract

We study an online dynamic pricing problem where the potential demand at each time period $t=1,2,\ldots, T$ is stochastic and dependent on the price. However, a perishable inventory is imposed at the beginning of each time $t$, censoring the potential demand if it exceeds the inventory level. To address this problem, we introduce a pricing algorithm based on the optimistic estimates of derivatives. We show that our algorithm achieves $\tilde{O}(\sqrt{T})$ optimal regret even with adversarial inventory series. Our findings advance the state-of-the-art in online decision-making problems with censored feedback, offering a theoretically optimal solution against adversarial observations.
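To make the censoring concrete, here is a minimal sketch of a single selling round. It assumes a linear potential demand $a - b p$ with additive noise: the TL;DR names $a$ and $b$ as linear demand parameters, but the additive-noise form, the function name `censored_round`, and its arguments are illustrative assumptions rather than the paper's notation.

```python
def censored_round(price, inventory, a, b, noise):
    """Simulate one period with censored demand (illustrative sketch).

    Assumed model (not taken verbatim from the paper):
        potential demand = a - b * price + noise
        observed sales   = min(potential demand, inventory)
    """
    potential = a - b * price + noise
    # The seller only observes sales, which are capped by the inventory level.
    sold = min(potential, inventory)
    # When the cap binds, the true potential demand is hidden from the learner.
    censored = potential > inventory
    revenue = price * sold
    return sold, revenue, censored


# Same price and noise, two inventory levels: the low one censors demand.
low = censored_round(price=3.0, inventory=4.0, a=10.0, b=2.0, noise=0.5)
high = censored_round(price=3.0, inventory=6.0, a=10.0, b=2.0, noise=0.5)
```

In the low-inventory call the learner sees only 4 units sold even though potential demand was 4.5, which is why naive estimates of $a$ and $b$ from observed sales would be biased.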

Paper Structure

This paper contains 31 sections, 6 theorems, 39 equations, 1 figure, 1 algorithm.

Key Result

Theorem 5.1

Let $\tau=\frac{1}{\sqrt{T}}$ in the C20CB algorithm. For any adversarial input sequence $\{\gamma_t\}_{t=1}^T$, C20CB suffers at most $\tilde{O}\left(\sqrt{T}\cdot\log\frac{T}{\delta}\right)$ regret with probability at least $1-\delta$.

Figures (1)

  • Figure 1: The price that C20CB proposes based on the confidence bounds of $\hat{r}_{k,t}$: (a) If there exist prices whose error bars contain $0$, we propose the largest price among them. (b) If no error bar contains $0$ but at least one lies below $0$, we propose the price whose error bar is closest to $0$. (c) If all error bars lie above $0$, we propose $\frac{\hat{a}}{2\hat{b}}$ for pure exploitation.
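
The three cases in the Figure 1 caption can be sketched as a small selection function. This is a hedged reading of the caption, not the authors' implementation: the name `propose_price`, the representation of each error bar as a `(lower, upper)` pair, and the choice in case (b) to measure closeness over all error bars by the gap between the bar and $0$ are assumptions. Tie-breaking is not specified in the caption; the sketch keeps the first minimizer.

```python
def propose_price(prices, lower, upper, a_hat, b_hat):
    """Choose a price from confidence bounds, following the Figure 1 cases.

    prices[k] is a candidate price; [lower[k], upper[k]] is its error bar.
    a_hat, b_hat are the estimated linear demand parameters.
    """
    bars = list(zip(prices, lower, upper))

    # Case (a): some error bars contain 0 -> take the largest such price.
    containing = [p for p, lo, hi in bars if lo <= 0.0 <= hi]
    if containing:
        return max(containing)

    # Case (b): no bar contains 0, but at least one lies entirely below 0
    # -> take the price whose bar is closest to 0 (assumed distance: the gap
    # between 0 and the nearer endpoint of the bar).
    if any(hi < 0.0 for _, _, hi in bars):
        def gap(bar):
            _, lo, hi = bar
            return -hi if hi < 0.0 else lo
        return min(bars, key=gap)[0]

    # Case (c): every bar lies above 0 -> pure exploitation at a_hat / (2 b_hat).
    return a_hat / (2.0 * b_hat)


grid = [1.0, 2.0, 3.0]
case_a = propose_price(grid, [-1.0, -0.5, 0.2], [1.0, 0.5, 0.9], 5.0, 1.0)
case_b = propose_price(grid, [0.3, -0.9, -2.0], [0.8, -0.1, -1.0], 5.0, 1.0)
case_c = propose_price(grid, [0.1, 0.2, 0.3], [0.5, 0.6, 0.7], 5.0, 1.0)
```

The exploitation price in case (c) is consistent with a linear demand model: without censoring or noise, revenue $p(a - bp)$ has derivative $a - 2bp$, which vanishes at $p = \frac{a}{2b}$.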

Theorems & Definitions (18)

  • Example 1.1: Performance Tickets
  • Example 1.2: Fruit Retails
  • Definition 3.1: Demand functions
  • Definition 3.2: Distributional functions
  • Definition 3.3: Revenue function
  • Definition 3.4: Regret
  • Theorem 5.1: Regret
  • proof
  • Lemma 5.2: revenue function $r_t(p)$
  • Lemma 5.3: Estimation error of $a$ and $b$
  • ...and 8 more