Dynamic pricing with Bayesian updates from online reviews

José Correa; Mathieu Mari; Andrew Xia

TL;DR

The paper studies dynamic pricing under Bayesian learning from online reviews, modeling a seller's revenue as a Bayesian bandit with latent product quality and a prior $x$ updated by like/dislike signals. It shows that the posterior after a sequence of reviews depends only on the counts $(\ell, d)$, enabling a combinatorial analysis via Catalan numbers and a Gittins-index perspective to compute the stopping threshold $x^*$ and the value $V(x)$. Two complementary solution methods are developed: a fast dynamic programming approach over the discrete prior set and a closed-form combinatorial approach that yields explicit expressions for $x^*$ and $V(x)$. The results quantify when dynamic pricing improves learning and revenue relative to static pricing and extend to multi-valued or continuous quality spaces with a tractable dynamic program, informing practical pricing policies on review-rich platforms.

Abstract

When launching new products, firms face uncertainty about market reception. Online reviews provide valuable information not only to consumers but also to firms, allowing them to adjust product characteristics, including the selling price. In this paper, we consider a pricing model with online reviews in which the quality of the product is uncertain, and both the seller and the buyers update their beliefs in a Bayesian manner to make purchasing and pricing decisions. We model the seller's pricing problem as a basic bandit problem and show a close connection with the celebrated Catalan numbers, allowing us to efficiently compute the overall future discounted reward of the seller. With this tool, we analyze and compare the optimal static and dynamic pricing strategies in terms of the probability of effectively learning the quality of the product.

Paper Structure (17 sections, 11 theorems, 37 equations, 5 figures)

Introduction
Preliminaries
Our model.
Product.
Buyers.
Priors.
Our Results
The Bandits Connection
Related models
A Dynamic Programming Approach
Static price scenario.
A Combinatorial Approach
Computing $x^*$.
Computing $V(x)$.
Computing the optimal static price.
...and 2 more sections

Key Result

Lemma 1

Given a prior $x$, the updated prior after a sequence of $\ell$ likes and $d$ dislikes does not depend on the order of the reviews and equals $x_{\ell,d}=\frac{xp^\ell(1-p)^d}{xp^\ell(1-p)^d + (1-x)q^\ell(1-q)^d}$. Furthermore, the probability of each such sequence depends only on $\ell$ and $d$.
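The order independence in Lemma 1 can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that $p$ and $q$ are the probabilities of a like when the product is good and bad respectively (this summary does not define them), and it borrows the values $p = 0.6$, $q = 0.4$ and the prior $0.5$ from the caption of Figure 4. The function names are hypothetical.

```python
from itertools import permutations


def posterior(x, p, q, likes, dislikes):
    """Closed-form updated prior x_{l,d} from Lemma 1."""
    good = x * p**likes * (1 - p) ** dislikes
    bad = (1 - x) * q**likes * (1 - q) ** dislikes
    return good / (good + bad)


def update_once(x, p, q, like):
    """One Bayes step after a single review (assumed reading of p and q)."""
    lik_good = p if like else 1 - p
    lik_bad = q if like else 1 - q
    return x * lik_good / (x * lik_good + (1 - x) * lik_bad)


def sequential(x, p, q, reviews):
    """Apply the single-review update along a sequence of reviews."""
    for like in reviews:
        x = update_once(x, p, q, like)
    return x


x0, p, q = 0.5, 0.6, 0.4
closed = posterior(x0, p, q, likes=2, dislikes=1)

# Every ordering of two likes and one dislike gives the same updated prior.
orders = set(permutations([True, True, False]))
results = [sequential(x0, p, q, order) for order in orders]
print(closed, results)
```

With these values the closed form gives $0.072 / (0.072 + 0.048) = 0.6$, and each of the three orderings agrees up to floating-point error.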

Figures (5)

Figure 1: An example of computing $V(x)$, under dynamic prices, through our dynamic programming approach. Note that $x^* = 0.33$ is lower than $x = 0.5 = \frac{c-q}{p-q}$, the prior at which the local reward is 0.
Figure 2: The number of paths from $A$ to $B$ along the grid that do not enter the light red area (like the blue path but not like the green one) is the Catalan's quadrilateral number $C_3^{1,2}(9,4)=570$ (for comparison, the unconditional number of paths from $A$ to $B$ is $\binom{9+4}{4}=715$). The slope of the boundary depends on parameters $a,b$. The classic Catalan's trapezoid arises when the boundary is horizontal.
Figure 3: The value of global expected reward depending on the fixed price $\pi$. In the symmetric setting (left), the blue points represent the revenue on the efficient frontier, which is the maximum possible price per discrete $x_\text{stop}$. The red points are not optimal because such prices yield the same number of possible net dislikes before the buyers stop buying. In the general setting (right), we see that revenue as a function of price is not as well-defined.
Figure 4: Seller's revenue depending on the static price. The red plot corresponds to the good/bad (binary) model with $p = 0.6, q = 0.4, c = 0.43, \delta = 0.99$, and the prior is such that the product is good/bad with probability 0.5. The blue plot corresponds to the extended model with $Q=[0.4, 0.6]$ and uniform prior distribution.
Figure 5: Seller's revenue as a function of the cost in the same instances as in Figure 4. The red function corresponds to the good/bad model, and the blue function corresponds to the extended model. On the left, we plot revenues resulting from the optimal static pricing (the price is optimized for each possible cost), and on the right, we plot revenues resulting from dynamic pricing.
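The path counts quoted in the caption of Figure 2 are instances of restricted lattice-path counting. The sketch below is an illustration under assumptions, not the paper's method: it counts monotone paths with a simple dynamic program and an optional `allowed` predicate. The exact boundary that defines $C_3^{1,2}(9,4)$ is not given in this summary, so the value 570 is not reproduced here. Instead the code checks the unconditional count of 715 and, as a familiar special case, the classic Catalan number obtained by forbidding paths from rising above the diagonal. The function name and the predicate are hypothetical.

```python
from math import comb


def count_paths(n, m, allowed=lambda i, j: True):
    """Count paths from (0, 0) to (n, m) made of unit right/up steps
    that visit only grid points for which allowed(i, j) is true."""
    ways = [[0] * (m + 1) for _ in range(n + 1)]
    ways[0][0] = 1 if allowed(0, 0) else 0
    for i in range(n + 1):
        for j in range(m + 1):
            if (i, j) == (0, 0) or not allowed(i, j):
                continue  # forbidden points keep a count of 0
            from_left = ways[i - 1][j] if i > 0 else 0
            from_below = ways[i][j - 1] if j > 0 else 0
            ways[i][j] = from_left + from_below
    return ways[n][m]


# Unconditional count for a 9-by-4 grid, matching the binomial in the caption.
unrestricted = count_paths(9, 4)

# Paths that never rise above the diagonal: the 4th Catalan number.
catalan_4 = count_paths(4, 4, allowed=lambda i, j: j <= i)

print(unrestricted, comb(13, 4), catalan_4)
```

A sloped boundary like the one drawn in Figure 2 would be passed as a different `allowed` predicate once its definition in terms of $a,b$ is known.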

Theorems & Definitions (22)

Lemma 1
proof
Proposition 2
proof
Corollary 3
Proposition 4
proof
Definition 5: Catalan's quadrilateral
Lemma 6
proof
...and 12 more
