Improved Algorithms for Contextual Dynamic Pricing
Matilde Tullii, Solenne Gaucher, Nadav Merlis, Vianney Perchet
TL;DR
The paper tackles contextual dynamic pricing where a seller must maximize revenue by posting prices based on covariates while receiving only binary feedback. It introduces VAPE, a valuation-approximation and price-elimination framework that decouples learning the context-dependent valuation $g(x)$ from estimating the demand and shares information across contexts. In the linear valuation setting, VAPE achieves a minimax-optimal $\tilde{O}(T^{2/3})$ regret, and in the non-parametric Hölder setting it attains a rate of $\tilde{O}(T^{(d+2\beta)/(d+3\beta)})$ under mild Lipschitz noise assumptions; both results improve over prior bounds and rely on adaptive, cross-context learning. The work offers a principled scheme for contextual pricing with minimal regularity requirements, with implications for revenue management and related online learning problems under contextual feedback.
Abstract
In contextual dynamic pricing, a seller sequentially prices goods based on contextual information. Buyers purchase a product only if its price is below their valuation. The goal of the seller is to design a pricing strategy that collects as much revenue as possible. We focus on two different valuation models. The first assumes that valuations depend linearly on the context and are further distorted by noise. Under minor regularity assumptions, our algorithm achieves an optimal regret bound of $\tilde{\mathcal{O}}(T^{2/3})$, improving on existing results. The second model removes the linearity assumption, requiring only that the expected buyer valuation is $\beta$-Hölder in the context. For this model, our algorithm obtains a regret of $\tilde{\mathcal{O}}(T^{(d+2\beta)/(d+3\beta)})$, where $d$ is the dimension of the context space.
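To make the feedback model concrete, here is a minimal Python sketch of one round of the interaction under the linear valuation model, plus a helper that evaluates the exponent of the non-parametric rate. It is not the VAPE algorithm. The context distribution, the Gaussian noise, and the names `simulate_round`, `price_fn`, `noise_scale` and `holder_regret_exponent` are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def simulate_round(rng, theta, price_fn, noise_scale=0.1):
    """Simulate one round of contextual pricing with binary feedback.

    Assumptions (illustrative only): contexts are uniform on [0, 1)^d and the
    valuation noise is Gaussian with standard deviation `noise_scale`.
    """
    d = theta.shape[0]
    x = rng.uniform(0.0, 1.0, size=d)            # context observed by the seller
    # Linear valuation distorted by noise; never revealed to the seller.
    valuation = float(x @ theta) + rng.normal(0.0, noise_scale)
    price = float(price_fn(x))                   # price posted by the seller
    sold = price <= valuation                    # the only feedback: sale or no sale
    revenue = price if sold else 0.0
    return x, price, bool(sold), revenue


def holder_regret_exponent(d, beta):
    """Exponent of T in the non-parametric rate T^{(d + 2*beta) / (d + 3*beta)}."""
    return (d + 2.0 * beta) / (d + 3.0 * beta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([0.5, 0.3])
    # A naive, hypothetical policy: post a fixed price regardless of context.
    total = sum(simulate_round(rng, theta, lambda x: 0.3)[3] for _ in range(1000))
    print("total revenue over 1000 rounds:", total)
    print("exponent for d=1, beta=1:", holder_regret_exponent(1, 1))
```

Under these assumptions the seller only ever sees `sold`, never `valuation`, which is why demand has to be estimated from sale outcomes. For $d = 1$ and $\beta = 1$ the helper evaluates to $3/4$, and the exponent approaches 1 as $d$ grows with $\beta$ fixed.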
