Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback

Zeng Fu, Jiashuo Jiang, Yuan Zhou

Abstract

We study the problem of learning to bid in repeated first-price auctions with budget constraints. In each period, the decision maker submits a bid to win the auction and aims to maximize the total collected reward, subject to a budget constraint over the whole horizon. We focus on the setting with one-sided information feedback, where only the winning bid is revealed to the decision maker in each period. Unlike previous papers, which assume homogeneous competitors' bids, we assume that the highest bid of the other bidders depends on the context of the impression; this dependence is initially unknown and must be learned over time. To tackle the learning difficulty, we propose a novel robust regression method based on conditional quantile invariance to learn the contextual parameter. Combining this method with a dual update procedure, we develop a new bidding algorithm and prove that it achieves $\widetilde{O}(\sqrt{T})$ regret, which is order-optimal. We further extend our approach to the multi-dimensional setting and demonstrate the practical efficiency of our algorithm through numerical experiments.
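To illustrate what a dual update procedure for budgeted bidding can look like, the following is a minimal sketch and not the algorithm of the paper. It assumes, purely for illustration, that values lie in $[0,1]$, that the highest competing bid is uniform on $[0,1]$ and known (so there is no context and nothing to learn), and that the bidder maximizes the per-round Lagrangian $(v - (1+\lambda)b)\,b$, which gives $b = v / (2(1+\lambda))$. The function name `run_dual_pacing` and all parameters are hypothetical.

```python
import random


def run_dual_pacing(values, competing_bids, budget, step_size):
    """Generic dual-update (budget pacing) sketch for first-price auctions.

    Assumption (not from the paper): the highest competing bid is uniform
    on [0, 1], so the win probability of a bid b is b and the bid that
    maximizes (v - (1 + lam) * b) * b is v / (2 * (1 + lam)).
    """
    rho = budget / len(values)  # target spend per round
    lam = 0.0                   # dual variable (price of the budget)
    remaining = budget
    reward = 0.0
    for v, m in zip(values, competing_bids):
        # Lagrangian-optimal bid, capped so the budget is never exceeded.
        bid = min(v / (2.0 * (1.0 + lam)), remaining)
        won = bid > m
        spend = bid if won else 0.0  # first price: the winner pays its bid
        if won:
            reward += v - bid
            remaining -= spend
        # Dual step: raise lam after overspending, lower it otherwise.
        lam = max(0.0, lam - step_size * (rho - spend))
    return reward, budget - remaining, lam


if __name__ == "__main__":
    random.seed(0)
    T = 1000
    vals = [random.random() for _ in range(T)]
    comp = [random.random() for _ in range(T)]
    total, spent, lam = run_dual_pacing(vals, comp, budget=50.0, step_size=T ** -0.5)
    print(f"reward={total:.2f} spent={spent:.2f} final_lambda={lam:.3f}")
```

The step size $1/\sqrt{T}$ in the usage example is a common choice for such dual schemes and is an assumption here. The paper's setting is harder: the competing-bid distribution depends on an unknown contextual parameter and must be estimated from one-sided feedback.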

Paper Structure

This paper contains 23 sections, 11 theorems, 111 equations, 1 figure, 4 algorithms.

Key Result

Lemma 1

Under the Lipschitz-continuity assumption on $G$ and the identifiability assumption, for any $\delta \in (0,1)$, there exists a constant $C_1 > 0$ such that with probability at least $1 - \delta/2$,

Figures (1)

  • Figure 1: Performance of bidding algorithms with and without contextual highest other bids

Theorems & Definitions (11)

  • Lemma 1: Uniform Concentration of Sample Quantiles
  • Lemma 2: Estimation Error Bound
  • Theorem 1: Main Estimation Result
  • Theorem 2: Main Result
  • Lemma 3
  • Theorem 3: Main Result in Multi-dimension
  • Lemma 8: Optimal stationary strategy (restatement)
  • Lemma 9: Stopping time bound (restatement)
  • Lemma 10: Concentration of reward estimators (restatement)
  • Lemma 11: Regret bound (restatement)
  • ...and 1 more