Learning and Collusion in Multi-unit Auctions

Simina Brânzei, Mahsa Derakhshan, Negin Golrezaei, Yanjun Han

TL;DR

This work analyzes repeated multi-unit uniform-price auctions under two price formats: the $K$-th highest bid and the $(K+1)$-st highest bid. It gives a polynomial-time offline algorithm that reduces the bidder's hindsight optimization to a maximum-weight path problem on a constructed DAG. Building on this structure, it designs online bidding algorithms with sublinear regret, using a path-kernel / weight-pushing approach that makes Hedge-style learning feasible despite exponentially many bid profiles. The authors provide regret upper bounds under both full-information and bandit feedback, complemented by regret lower bounds. They also analyze equilibria through the core solution concept and find a stark contrast between the two price formats: zero-price core-stable equilibria arise in the $(K+1)$-st price setting (implying collusion risk), whereas the $K$-th price format avoids such equilibria. Overall, the paper contributes bidding algorithms for repeated multi-unit auctions and sheds light on fundamental strategic differences between the two pricing schemes, with implications for applications such as carbon-license auctions.

Abstract

We consider repeated multi-unit auctions with uniform pricing, which are widely used in practice for allocating goods such as carbon licenses. In each round, $K$ identical units of a good are sold to a group of buyers that have valuations with diminishing marginal returns. The buyers submit bids for the units, and then a price $p$ is set per unit so that all the units are sold. We consider two variants of the auction, where the price is set to the $K$-th highest bid and $(K+1)$-st highest bid, respectively. We analyze the properties of this auction in both the offline and online settings. In the offline setting, we consider the problem that one player $i$ is facing: given access to a data set that contains the bids submitted by competitors in past auctions, find a bid vector that maximizes player $i$'s cumulative utility on the data set. We design a polynomial time algorithm for this problem, by showing it is equivalent to finding a maximum-weight path on a carefully constructed directed acyclic graph. In the online setting, the players run learning algorithms to update their bids as they participate in the auction over time. Based on our offline algorithm, we design efficient online learning algorithms for bidding. The algorithms have sublinear regret, under both full information and bandit feedback structures. We complement our online learning algorithms with regret lower bounds. Finally, we analyze the quality of the equilibria in the worst case through the lens of the core solution concept in the game among the bidders. We show that the $(K+1)$-st price format is susceptible to collusion among the bidders; meanwhile, the $K$-th price format does not have this issue.
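
To make the two pricing rules concrete, here is a minimal Python sketch of a single auction round. The function names, the tie-breaking rule (by bidder name), and the example bids and marginal values are illustrative assumptions; the abstract does not specify them.

```python
from typing import Dict, List, Tuple


def run_uniform_price_auction(
    bids: Dict[str, List[float]], K: int, rule: str = "K+1"
) -> Tuple[float, Dict[str, int]]:
    """Sell K identical units at one uniform price.

    rule="K"   sets the price to the K-th highest bid.
    rule="K+1" sets the price to the (K+1)-st highest bid.
    Assumes at least K bids are submitted in total.
    """
    # Flatten all per-unit bids, tagging each with its bidder.
    flat = [(b, name) for name, vec in bids.items() for b in vec]
    # Highest bids first; ties broken by bidder name (an assumption).
    flat.sort(key=lambda item: (-item[0], item[1]))

    if rule == "K":
        price = flat[K - 1][0]
    elif rule == "K+1":
        # If there is no (K+1)-st bid, fall back to a price of 0 (assumption).
        price = flat[K][0] if len(flat) > K else 0.0
    else:
        raise ValueError("rule must be 'K' or 'K+1'")

    # The K highest bids each win one unit.
    allocation = {name: 0 for name in bids}
    for _, name in flat[:K]:
        allocation[name] += 1
    return price, allocation


def utility(marginal_values: List[float], units_won: int, price: float) -> float:
    """Value of the units won (diminishing marginal values) minus payment."""
    return sum(marginal_values[:units_won]) - price * units_won


# Hypothetical round with K = 3 units and two bidders.
bids = {"alice": [5, 3, 1], "bob": [4, 2, 0]}
for rule in ("K", "K+1"):
    price, allocation = run_uniform_price_auction(bids, K=3, rule=rule)
    print(rule, price, allocation, utility([6, 4, 2], allocation["alice"], price))

# Hypothetical round where every losing bid is zero.
low_bids = {"alice": [5, 5, 0], "bob": [5, 0, 0]}
print(run_uniform_price_auction(low_bids, K=3, rule="K"))
print(run_uniform_price_auction(low_bids, K=3, rule="K+1"))
```

In the second round every losing bid is zero, so the $(K+1)$-st price is 0 while the $K$-th price stays at 5. This only illustrates how the two rules can diverge; it is not the paper's core-stability argument.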

Paper Structure

This paper contains 46 sections, 11 theorems, 72 equations, 1 figure, 2 algorithms.

Key Result

Theorem 1

Computing an optimum bid vector for one player in the offline setting is equivalent to finding a maximum-weight path in a DAG and can be solved in polynomial time.
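
The maximum-weight path subproblem can be solved by dynamic programming over a topological order. The sketch below is a generic routine, not the paper's construction: the function name, the edge-dictionary representation, and the toy weights are assumptions, since the edge weights of the paper's DAG are not given in this summary.

```python
from collections import defaultdict, deque


def max_weight_path(edges, source, sink):
    """Return (best_weight, path) for a maximum-weight source-to-sink path.

    edges: dict mapping (u, v) -> weight of the directed edge u -> v.
    """
    adj = defaultdict(list)
    indeg = defaultdict(int)
    nodes = {source, sink}
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order via Kahn's algorithm.
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # best[n] = weight of the best path from source to n found so far.
    best = {n: float("-inf") for n in nodes}
    best[source] = 0.0
    parent = {}
    for u in order:
        if best[u] == float("-inf"):
            continue  # u is not reachable from the source
        for v, w in adj[u]:
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                parent[v] = u
    if best[sink] == float("-inf"):
        raise ValueError("sink is not reachable from source")

    # Walk parent pointers back from the sink to recover the path.
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[sink], path[::-1]


# Toy DAG with made-up weights.
toy_edges = {("s", "a"): 1, ("s", "b"): 2, ("a", "b"): 3, ("a", "t"): 5, ("b", "t"): 1}
print(max_weight_path(toy_edges, "s", "t"))
```

The routine visits each node and edge a constant number of times, so its running time is linear in the size of the graph. That is what makes the reduction polynomial whenever the constructed DAG has polynomial size.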

Figures (1)

  • Figure 1: Example of the DAG $G_i$ from Definition 1. We have $K=4$ units and the set of candidate bids for player $i$ is $\mathcal{S}_i = \{0, 1, 2\}$. The graph has a source $z_{-}$, a sink $z_{+}$, and nodes $z_{r,j}$ for all $r \in \mathcal{S}_i$ and $j \in [4]$. Edges are of the form $(z_{r,j}, z_{s,j+1})$ for all $j < 4$ and $r, s \in \mathcal{S}_i$ with $r \geq s$. There are also edges from the source $z_{-}$ to the nodes $z_{r,1}$ and from the nodes $z_{r,4}$ to the sink $z_{+}$, for all $r \in \mathcal{S}_i$.
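
The graph described in the Figure 1 caption can be built directly from that description. The sketch below constructs the unweighted edge set and counts source-to-sink paths; the function names and the node encoding (tuples `(r, j)` for $z_{r,j}$, strings `"z-"` and `"z+"` for source and sink) are assumptions made for illustration. Reading each path as a non-increasing bid vector is an interpretation of the edge condition $r \geq s$, not a statement quoted from the paper.

```python
from itertools import product


def build_bid_dag(candidate_bids, K):
    """Edges of the DAG from the caption: layers j = 1..K, one node per bid r."""
    S = sorted(candidate_bids)
    source, sink = "z-", "z+"
    edges = []
    for r in S:
        edges.append((source, (r, 1)))  # source -> z_{r,1}
        edges.append(((r, K), sink))    # z_{r,K} -> sink
    for j in range(1, K):
        for r, s in product(S, repeat=2):
            if r >= s:                  # bids may not increase along a path
                edges.append(((r, j), (s, j + 1)))
    return source, sink, edges


def count_paths(source, sink, edges):
    """Count source-to-sink paths with memoized depth-first search."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    memo = {}

    def paths_from(u):
        if u == sink:
            return 1
        if u not in memo:
            memo[u] = sum(paths_from(v) for v in successors.get(u, []))
        return memo[u]

    return paths_from(source)


# The instance from the caption: K = 4 units, candidate bids {0, 1, 2}.
source, sink, edges = build_bid_dag({0, 1, 2}, K=4)
print(len(edges))                        # 24 edges
print(count_paths(source, sink, edges))  # 15 paths
```

For this instance the graph has 24 edges, and its 15 source-to-sink paths are exactly the non-increasing sequences of length 4 over $\{0, 1, 2\}$. The number of paths grows quickly with $K$ and $|\mathcal{S}_i|$ while the graph itself stays small, which is why working on the graph rather than enumerating paths is attractive.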

Theorems & Definitions (28)

  • Example 1
  • Theorem 1: informal
  • Theorem 2: Full information feedback, upper bound
  • Theorem 3: Bandit feedback, upper bound
  • Theorem 4: Lower bound, full information and bandit feedback
  • Theorem 5: Core without transfers
  • Theorem 6: Core with transfers
  • Definition 1: The graph $G_i$
  • Proof
  • Proof of Theorem 1
  • ...and 18 more