Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity

Yige Wang, Jiashuo Jiang

Abstract

In this paper, we study how a budget-constrained bidder should learn to bid adaptively in repeated first-price auctions to maximize cumulative payoff. This problem arises from the recent industry-wide shift from second-price auctions to first-price auctions in display advertising, which renders truthful bidding suboptimal. We propose a simple dual-gradient-descent-based bidding policy that maintains a dual variable for the budget constraint as the bidder consumes the budget. We analyze two settings based on the bidder's knowledge of future private values: (i) an uninformative setting where the (potentially non-stationary) distributions are entirely unknown, and (ii) an informative setting where a prediction of budget allocation is available in advance. We characterize the performance loss (regret) relative to an optimal policy with complete information. In the uninformative setting, we show that the regret is $\widetilde{O}(\sqrt{T})$ plus a Wasserstein-based variation term capturing non-stationarity, which is order-optimal. In the informative setting, the variation term can be eliminated using predictions, yielding a regret of $\widetilde{O}(\sqrt{T})$ plus the prediction error. Furthermore, we go beyond the global budget constraint by introducing a refined benchmark based on a per-period budget allocation plan, achieving exactly $\widetilde{O}(\sqrt{T})$ regret. We also establish robustness guarantees when the baseline policy deviates from the planned allocation, covering both ideal and adversarial deviations.
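
To make the idea of maintaining a dual variable for the budget constraint concrete, the following is a minimal sketch of a generic dual-gradient-descent pacing loop for repeated first-price auctions. It is not the paper's algorithm: the function name `dual_descent_bidding`, the finite `bid_grid`, the empirical win-probability estimate, and the assumption that the highest competing bid is revealed after every round are all illustrative choices made here.

```python
def dual_descent_bidding(values, competing_bids, budget, step_size, bid_grid):
    """Simulate a generic dual-descent pacing rule (illustrative sketch only).

    Assumptions (not taken from the paper): the highest competing bid is
    observed after each round, bids are restricted to `bid_grid`, and the
    win probability is estimated by the empirical CDF of past competing bids.
    """
    horizon = len(values)
    rho = budget / horizon      # target spend per round
    mu = 0.0                    # dual variable for the budget constraint
    remaining = budget
    payoff = 0.0
    seen = []                   # competing bids observed so far

    def win_prob(b):
        # Fraction of past competing bids that bid b would have beaten or tied.
        if not seen:
            return 1.0
        return sum(1 for x in seen if x <= b) / len(seen)

    for v, d in zip(values, competing_bids):
        # Pick the grid bid maximizing the estimated Lagrangian payoff
        # (v - (1 + mu) * b) * P(win with b).
        bid = max(bid_grid, key=lambda b: (v - (1 + mu) * b) * win_prob(b))
        bid = min(bid, remaining)       # never bid more than the budget left
        won = bid >= d                  # ties are resolved in our favor
        cost = bid if won else 0.0      # first-price: the winner pays its bid
        if won:
            payoff += v - bid
            remaining -= cost
        # Projected dual update: mu rises when spend exceeds the target rho.
        mu = max(0.0, mu + step_size * (cost - rho))
        seen.append(d)

    return payoff, budget - remaining, mu
```

A larger `mu` shades bids more aggressively, so overspending early in the horizon is corrected by lower bids later. The step size and the win-probability estimator are left as free choices in this sketch.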

Paper Structure

This paper contains 34 sections, 12 theorems, 84 equations, 3 figures, 1 algorithm.

Key Result

Lemma 2.1

For any $\mu \geq 0$, we have $V^{\mathrm{LR}}(\mu) \geq V^{\mathrm{OPT}}$.

Figures (3)

  • Figure 1: Relationship between Average Relative Error and Time Horizon
  • Figure 2: Relationship between Average Relative Error and Wasserstein Distance $\mathcal{W}_T$
  • Figure 3: Relationship between Average Relative Error and Prediction Error $V_T$

Theorems & Definitions (14)

  • Lemma 2.1: Weak Duality
  • Theorem 3.1
  • Corollary 3.2
  • Proposition 3.3
  • Lemma 4.1
  • Theorem 4.2
  • Proposition 4.3
  • Theorem 5.1
  • Theorem 5.2
  • Lemma A.1
  • ...and 4 more