Artificial Intelligence for Multi-Unit Auction design

Peyman Khezr, Kendall Taylor

TL;DR

This paper tackles the challenge of understanding bidding behavior in multi-unit auctions, where theory is limited. It proposes a model-free reinforcement learning framework to simulate bidding across three practical sealed-bid formats (Discriminatory, Generalized Second-Price, and Uniform-Price) using six algorithms from the Q-Learning, Policy Gradient, and Actor-Critic families. Key findings show that Proximal Policy Optimization (PPO) delivers the most stable learning and the highest payoffs, while revenue and efficiency rankings vary by auction type and item count. Uniform-Price often yields the best efficiency, and Generalized Second-Price is comparatively stable in both revenue and efficiency. The study demonstrates the potential of AI-driven auction design and provides a baseline for comparing future auction variants using RL, with practical implications for designing more efficient and revenue-robust multi-unit auctions.

Abstract

Understanding bidding behavior in multi-unit auctions remains an ongoing challenge for researchers. Despite their widespread use, theoretical insights into the bidding behavior, revenue ranking, and efficiency of commonly used multi-unit auctions are limited. This paper utilizes artificial intelligence, specifically reinforcement learning, as a model-free learning approach to simulate bidding in three prominent multi-unit auctions employed in practice. We introduce six algorithms that are suitable for learning and bidding in multi-unit auctions and compare them using an illustrative example. This paper underscores the significance of using artificial intelligence in auction design, particularly in enhancing the design of multi-unit auctions.
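To make the three sealed-bid formats concrete, the sketch below computes payments under commonly used textbook versions of each pricing rule. It is an illustration under stated assumptions, not the paper's implementation: the function names `allocate` and `payments` are hypothetical, Uniform-Price is assumed to charge the highest losing bid (some variants charge the lowest winning bid instead), and Generalized Second-Price is assumed to charge each winning bid the next-highest bid in the overall ranking. The paper's exact rules may differ.

```python
from typing import Dict, List, Tuple

# A bid is (bidder id, price offered for one unit); a bidder who demands
# two items submits two such bids. These names are illustrative only.
Bid = Tuple[int, float]


def allocate(bids: List[Bid], k: int) -> Tuple[List[Bid], List[Bid]]:
    """Rank all unit bids from high to low; the top k win one unit each."""
    ranked = sorted(bids, key=lambda b: b[1], reverse=True)
    return ranked[:k], ranked[k:]


def payments(bids: List[Bid], k: int, rule: str) -> Dict[int, float]:
    """Total payment per winning bidder under an assumed pricing rule.

    rule = "DP":  discriminatory, each winning bid pays its own amount.
    rule = "GSP": each winning bid pays the next-highest bid (assumption).
    rule = "UP":  every winning bid pays the highest losing bid (assumption).
    """
    winners, losers = allocate(bids, k)
    ranked = winners + losers
    totals: Dict[int, float] = {}
    for pos, (bidder, bid) in enumerate(winners):
        if rule == "DP":
            price = bid
        elif rule == "GSP":
            # Price is the bid ranked immediately below, or 0 if none exists.
            price = ranked[pos + 1][1] if pos + 1 < len(ranked) else 0.0
        elif rule == "UP":
            price = losers[0][1] if losers else 0.0
        else:
            raise ValueError(f"unknown rule: {rule}")
        totals[bidder] = totals.get(bidder, 0.0) + price
    return totals


# Three bidders, two unit bids each, three items on offer (made-up numbers).
example_bids = [(0, 10.0), (0, 8.0), (1, 9.0), (1, 5.0), (2, 7.0), (2, 3.0)]
for rule in ("DP", "GSP", "UP"):
    result = payments(example_bids, 3, rule)
    print(rule, result, "revenue:", sum(result.values()))
```

With these made-up bids, the winning bids are 10, 9 and 8, and the highest losing bid is 7. Under the assumed rules, seller revenue is 27 for DP, 24 for GSP and 21 for UP when the bids are held fixed. In practice bidders adjust their bids to the pricing rule, which is why the learned revenue ranking need not follow this order.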

Paper Structure (21 sections, 8 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Bidder learning ratios for the Discriminatory Price (DP) auction comparing four, six and eight items on offer with six bidders, each demanding two items.
  • Figure 2: Bidder learning ratios for the Generalised Second-Price (GSP) auction comparing four, six and eight items on offer with six bidders, each demanding two items.
  • Figure 3: Bidder learning ratios for the Uniform-Price (UP) auction comparing four, six and eight items on offer with six bidders, each demanding two items.
  • Figure 4: Comparison of auction performance metrics for the Discriminatory Price (DP), Generalised Second-Price (GSP), and Uniform-Price (UP) auctions with four, six and eight items on offer. Each auction comprised six bidders (six different reinforcement learning algorithms), each demanding two items.
  • Figure 5: PPO: Comparison of auction performance metrics for the Discriminatory Price (DP), Generalised Second-Price (GSP), and Uniform-Price (UP) auctions with four, six and eight items on offer. Each auction comprised six bidders (six PPO algorithms), each demanding two items.