Competitive Demand Learning: A Non-cooperative Pricing Algorithm with Coordinated Price Experimentation
Yongge Yang, Yu-Ching Lee, Po-An Chen
TL;DR
This paper addresses dynamic, competitive pricing when firms face unknown demand. It introduces the Competitive Demand Learning (CDL) algorithm, which coordinates price experiments across firms to reveal demand responses while steering prices toward the clairvoyant Nash equilibrium. The authors establish convergence via contraction properties and derive a sublinear regret bound of $O(F\sqrt{T})$ and a revenue-difference bound of $O(F^2 T^{3/4})$, where $F$ is the number of firms and $T$ the number of periods, with extensions to partially clairvoyant settings. Numerical experiments on linear and multinomial logit demands corroborate the theoretical guarantees and illustrate how learning costs scale with the number of firms. The work offers a practical mechanism for platform-mediated, coordinated pricing in competitive markets with unknown demand functions.
Abstract
We consider a periodic equilibrium pricing problem for multiple firms over a planning horizon of $T$ periods. In each period, firms set their selling prices and receive stochastic demand from consumers. Firms do not know their underlying demand curves, but they wish to set selling prices that maximize total revenue under competition. Hence, they must run price experiments so that the observed demand data are informative enough to guide pricing decisions. However, uncoordinated price updating can render the demand information gathered by price experimentation less informative or inaccurate. We design a nonparametric learning algorithm to facilitate coordinated dynamic pricing, in which competing firms estimate their demand functions from observations and adjust their pricing strategies in a prescribed manner. We show that the pricing decisions, determined by the estimated demand functions, converge to the underlying equilibrium as time progresses. We obtain a revenue-difference bound of order $O(F^2 T^{3/4})$ and a regret bound of order $O(F T^{1/2})$, where $F$ is the number of competing firms and $T$ is the number of periods. We also develop a modified algorithm to handle the situation where some firms know their demand curves.
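To make the idea of coordinated price experimentation concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a hypothetical linear demand model, a fixed experimentation step `delta`, and a schedule in which firms perturb their prices one at a time while the others hold theirs fixed, so that each firm can attribute the change in its own demand to its own price change. The function name `simulate` and all of its parameters are illustrative assumptions that do not appear in the paper.

```python
import random


def simulate(a, b, c, sigma=0.1, epochs=15, n=200, delta=0.5, seed=0):
    """Coordinated experimentation followed by best responses (illustrative only).

    Assumed demand of firm i at prices p:
        a[i] - b[i] * p[i] + sum_{j != i} c[i][j] * p[j] + Gaussian noise.
    Firms never read a, b, c directly; they only see noisy demand averages.
    """
    rng = random.Random(seed)
    num_firms = len(a)
    p = [1.0] * num_firms  # arbitrary starting prices (assumption)

    def avg_demand(i, prices):
        # Average of n noisy demand observations for firm i at fixed prices.
        mean = a[i] - b[i] * prices[i] + sum(
            c[i][j] * prices[j] for j in range(num_firms) if j != i
        )
        return mean + sum(rng.gauss(0.0, sigma) for _ in range(n)) / n

    for _ in range(epochs):
        # Phase 0: every firm holds its current price and records demand.
        base = [avg_demand(i, p) for i in range(num_firms)]
        new_p = list(p)
        for i in range(num_firms):
            # Phase i: only firm i raises its price by delta; rivals stay put.
            trial = list(p)
            trial[i] += delta
            bumped = avg_demand(i, trial)
            slope = (base[i] - bumped) / delta  # estimate of b[i]
            if slope <= 1e-9:
                continue  # uninformative estimate: keep the current price
            # Estimated intercept of firm i's demand given rivals' prices.
            intercept = base[i] + slope * p[i]
            # Revenue p_i * (intercept - slope * p_i) peaks at intercept / (2 * slope).
            new_p[i] = intercept / (2.0 * slope)
        p = new_p  # all firms switch to their estimated best responses together
    return p
```

For two symmetric firms with `a = [10, 10]`, `b = [2, 2]`, and cross-price coefficient `0.5`, the clairvoyant Nash price solves $p = (10 + 0.5p)/4$, that is $p = 10/3.5 \approx 2.857$. Calling `simulate([10.0, 10.0], [2.0, 2.0], [[0.0, 0.5], [0.5, 0.0]])` should return prices close to that value. In this example the best-response map is a contraction with factor $0.5/4 = 0.125$, which is why the iterates settle quickly.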
