Bandit Sequential Posted Pricing via Half-Concavity

Sahil Singla; Yifan Wang

TL;DR

This work studies bandit feedback for Sequential Posted Pricing with $n$ buyers over $T$ rounds, establishing near-optimal regret bounds that depend on distribution regularity. By proving a half-concavity property for regular distributions, the authors extend a single-buyer learning framework to multiple sequential buyers, achieving $\tilde{O}(\text{poly}(n)\sqrt{T})$ regret in the regular case and $\tilde{O}(\text{poly}(n)\,T^{2/3})$ for general distributions. For general distributions, they discretize values and operate over price-sets, combining two-step learning with generalized half-concavity to maintain feasibility and control regret. A fundamental lower bound shows linear regret is unavoidable under adversarial valuations, underscoring the necessity of distributional assumptions. Overall, the paper advances learning-enabled posted pricing by exploiting revenue-curve structure and providing tight regret guarantees with practical learning primitives.

Abstract

Sequential posted pricing auctions are popular because of their simplicity in practice and their tractability in theory. A usual assumption in their study is that the Bayesian prior distributions of the buyers are known to the seller, while in reality these priors can only be accessed from historical data. To overcome this assumption, we study sequential posted pricing in the bandit learning model, where the seller interacts with $n$ buyers over $T$ rounds: In each round the seller posts $n$ prices for the $n$ buyers and the first buyer with a valuation higher than the price takes the item. The only feedback that the seller receives in each round is the revenue. Our main results obtain nearly-optimal regret bounds for single-item sequential posted pricing in the bandit learning model. In particular, we achieve an $\tilde{O}(\mathsf{poly}(n)\sqrt{T})$ regret for buyers with (Myerson's) regular distributions and an $\tilde{O}(\mathsf{poly}(n)T^{{2}/{3}})$ regret for buyers with general distributions, both of which are tight in the number of rounds $T$. Our result for regular distributions was previously not known even for the single-buyer setting and relies on a new half-concavity property of the revenue function in the value space. For $n$ sequential buyers, our technique is to run a generalized single-buyer algorithm for all the buyers and to carefully bound the regret from the sub-optimal pricing of the suffix buyers.
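The interaction described in the abstract can be illustrated with a minimal Python sketch of a single round. The names `play_round` and `expected_revenue`, the Monte Carlo estimate, and the strict-inequality tie-breaking are assumptions made for illustration; they are not taken from the paper, and this is not the paper's learning algorithm.

```python
import random


def play_round(prices, values):
    """Return the seller's revenue for one round (illustrative sketch).

    Buyers are visited in index order. The first buyer whose valuation
    is strictly higher than their posted price takes the item and pays
    that price. If no buyer accepts, the revenue is 0.
    """
    for price, value in zip(prices, values):
        if value > price:
            return price
    return 0.0


def expected_revenue(prices, samplers, trials=20000, seed=0):
    """Monte Carlo estimate of expected revenue for a fixed price vector.

    `samplers` is a list of hypothetical functions, one per buyer; each
    takes a random.Random instance and returns one sampled valuation.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        values = [sample(rng) for sample in samplers]
        total += play_round(prices, values)
    return total / trials
```

In the bandit model, the seller observes only the value returned by `play_round` in each round, not which buyer bought nor any valuation. A helper like `expected_revenue` is available only to an analyst who knows the distributions; the learner has to estimate the same quantity from repeated revenue observations.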

Paper Structure (30 sections, 24 theorems, 71 equations, 3 figures, 11 algorithms)

Introduction
Model and Results
Techniques
Further Related Work
A Single Buyer with Regular Distribution
Half-Concavity
$\widetilde{O}(\sqrt{T})$ Regret for Half-Concave Functions: Proof of \ref{thm:srmain}
Main Sub-Routine: Proof of \ref{thm:1RMain} via Half-Concavity
Algorithm Overview.
Step 1: Find $\hat{p}$ to Approximate $p^*$
Step 2: Generating New Confidence Interval
$n$ Buyers with Regular Distributions
Proof Overview
Notation
$\widetilde{O}(\mathsf{poly}(n)\sqrt{T})$ Algorithm for $n$ Buyers: Proof of \ref{thm:newGRMain}
...and 15 more sections

Key Result

Theorem 2.3

For Bandit Posted Pricing with a single buyer, if the revenue function $R(p) := p\cdot (1 - F(p))$ is half-concave, then there exists an algorithm with $O(\sqrt{T} \log T)$ regret.
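As a worked illustration of the revenue function in this theorem, consider a buyer whose value is uniform on $[0,1]$, so $F(p) = p$. This example is an assumption chosen for illustration and is not taken from the paper. The check below follows the description of half-concavity in the Figure 1 caption (Lipschitz, and concave before $p^*$).

$$
\begin{aligned}
R(p) &= p\,(1 - F(p)) = p\,(1 - p), \\
R'(p) &= 1 - 2p, \qquad R''(p) = -2 < 0, \\
p^* &= \tfrac{1}{2}, \qquad R(p^*) = \tfrac{1}{4}, \\
|R'(p)| &\le 1 \quad \text{for all } p \in [0,1].
\end{aligned}
$$

Here $R$ is concave on all of $[0,1]$, and in particular on $[0, p^*]$, and it is $1$-Lipschitz. The uniform distribution is also regular in Myerson's sense, since its virtual value $v - \frac{1 - F(v)}{f(v)} = 2v - 1$ is increasing in $v$.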

Figures (3)

Figure 1: An example of a half-concave function, where the function is Lipschitz and concave before $p^*$.
Figure 2: The two main steps of \ref{thm:1RMain}
Figure 3: Two cases of Step 1 as discussed in \ref{alg:sr-esp}.

Theorems & Definitions (70)

Definition 2.1: Regularity
Definition 2.2: Half-Concavity
Theorem 2.3
Lemma 2.4
Corollary 2.5: Corollary of \ref{thm:srmain} and \ref{lma:1RHalfConSpecial}
Proof: Proof of \ref{lma:1RHalfConSpecial}
Lemma 2.6
Proof: Proof of \ref{thm:srmain}
Lemma 2.7
Proof: Proof of \ref{lma:Step1}
...and 60 more
