Improved learning rates in multi-unit uniform price auctions

Marius Potfer; Dorian Baudry; Hugo Richard; Vianney Perchet; Cheng Wan

TL;DR

This work studies online learning in repeated $K$-unit uniform price auctions with adversarial bids, motivated by electricity markets. It introduces a novel action space $H_\epsilon$ based on bid gaps, enabling a decomposition of the utility into independent sub-utilities and yielding an improved bandit regret of $\tilde{O}(K^{4/3}T^{2/3})$ (tight up to logarithmic factors), plus a lower bound of $\Omega(T^{2/3})$. It also analyzes a richer all-winner feedback model, achieving $\tilde{O}(K^{5/2}\sqrt{T})$ regret, which bridges the gap between the bandit and full-information regimes. The paper unifies these results with known full-information rates and establishes the first bandit lower bound in this auction setting, offering a principled, scalable approach to strategic bidding in repeated multi-unit auctions and informing market design in electricity contexts.

Abstract

Motivated by the strategic participation of electricity producers in electricity day-ahead markets, we study the problem of online learning in repeated multi-unit uniform price auctions, focusing on the setting where opposing bids are adversarial. The main contribution of this paper is a new model of the bid space. We prove that a learning algorithm leveraging the structure of this problem achieves a regret of $\tilde{O}(K^{4/3}T^{2/3})$ under bandit feedback, improving over the bound of $\tilde{O}(K^{7/4}T^{3/4})$ previously obtained in the literature. This improved regret rate is tight up to logarithmic terms. Inspired by electricity reserve markets, we further introduce a different feedback model under which all winning bids are revealed. This feedback interpolates between the full-information and bandit scenarios depending on the auctions' results. We prove that, under this feedback, the algorithm that we propose achieves a regret of $\tilde{O}(K^{5/2}\sqrt{T})$.
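To get a feel for how the stated rates compare, the following minimal sketch evaluates the three bounds from the abstract with all constants and logarithmic factors dropped. This simplification is an assumption made purely for illustration, and the function names are hypothetical rather than taken from the paper.

```python
import math


def bandit_bound_new(K: int, T: int) -> float:
    """K^(4/3) * T^(2/3): improved bandit rate, constants and logs ignored."""
    return K ** (4 / 3) * T ** (2 / 3)


def bandit_bound_prev(K: int, T: int) -> float:
    """K^(7/4) * T^(3/4): previously known bandit rate, constants and logs ignored."""
    return K ** (7 / 4) * T ** (3 / 4)


def all_winner_bound(K: int, T: int) -> float:
    """K^(5/2) * sqrt(T): rate under all-winner feedback, constants and logs ignored."""
    return K ** (5 / 2) * math.sqrt(T)


def improvement_ratio(K: int, T: int) -> float:
    """Previous bandit rate divided by the improved one.

    Algebraically this equals K^(5/12) * T^(1/12).
    """
    return bandit_bound_prev(K, T) / bandit_bound_new(K, T)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for K, T in [(2, 10**4), (5, 10**6), (10, 10**8)]:
        print(K, T, improvement_ratio(K, T), all_winner_bound(K, T))
```

Under these simplifications the improvement factor is $K^{5/12}T^{1/12}$, so the gain grows with both the number of units and the horizon. The same arithmetic shows that the $\sqrt{T}$ rate only falls below the improved bandit rate once $T > K^7$, since $K^{5/2}T^{1/2} < K^{4/3}T^{2/3}$ is equivalent to $K^{7/6} < T^{1/6}$.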

Paper Structure (30 sections, 20 theorems, 63 equations, 1 figure, 1 table, 2 algorithms)

Introduction
Auction rules
Repeated setting
Feedback
Related Work
Contribution
Action space
Motivation for an alternative representation
Action space tailored to the outcomes
Utility decomposition
Learning algorithms and guarantees
Algorithm for online learning in K-unit uniform auction
Estimators
Regret Analysis
Regret lower bound
...and 15 more sections

Key Result

Lemma 1

For each pseudo-bid $\mathbf{h} \in H_\epsilon$, there exists a unique $\mathbf{b} \in B_\epsilon$ such that $\mathbf{h}=\mathbf{h}_\mathbf{b}$. This therefore defines a bijective mapping between $H_\epsilon$ and $B_\epsilon$.

Figures (1)

Figure 1: Graph representation of action spaces $B_\epsilon$ (branzei_learning_2024) and $B(\mathcal{P}_\epsilon)$ (this paper)

Theorems & Definitions (39)

Remark 1
Lemma 1
Proof
Corollary 1
Lemma 2
Proof
Lemma 3
Proof of the decomposition lemma
Lemma 4
Definition 3.1: Estimators
...and 29 more
