Improved learning rates in multi-unit uniform price auctions
Marius Potfer, Dorian Baudry, Hugo Richard, Vianney Perchet, Cheng Wan
TL;DR
This work studies online learning in repeated $K$-unit uniform price auctions with adversarial bids, motivated by electricity markets. It introduces a novel action space $H_\epsilon$ based on bid gaps, which allows the utility to be decomposed into independent sub-utilities and yields an improved bandit regret of $\tilde{O}(K^{4/3}T^{2/3})$ (tight up to logarithmic factors), together with a lower bound of order $T^{2/3}$. It also analyzes a richer all-winner feedback model, achieving $\tilde{O}(K^{5/2}\sqrt{T})$ regret and bridging the gap between the bandit and full-information regimes. The paper unifies these results with known full-information rates and establishes the first lower bound under bandit feedback in this auction setting, offering a principled, scalable approach to strategic bidding in repeated multi-unit auctions and informing market design in electricity contexts.
Abstract
Motivated by the strategic participation of electricity producers in the day-ahead electricity market, we study the problem of online learning in repeated multi-unit uniform price auctions, focusing on the setting where opposing bids are adversarial. The main contribution of this paper is a new model of the bid space. Indeed, we prove that a learning algorithm leveraging the structure of this problem achieves a regret of $\tilde{O}(K^{4/3}T^{2/3})$ under bandit feedback, improving over the bound of $\tilde{O}(K^{7/4}T^{3/4})$ previously obtained in the literature. This improved regret rate is tight up to logarithmic terms. Inspired by electricity reserve markets, we further introduce a different feedback model under which all winning bids are revealed. This feedback interpolates between the full-information and bandit scenarios depending on the auctions' results. We prove that, under this feedback, the algorithm that we propose achieves a regret of $\tilde{O}(K^{5/2}\sqrt{T})$.
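To make the size of these improvements concrete, the stated upper bounds can be compared directly. The following is only a back-of-the-envelope sketch: it assumes that constants and logarithmic factors (hidden by $\tilde{O}$) are ignored, so it compares orders of magnitude rather than actual regret values.

$$
\frac{K^{7/4}T^{3/4}}{K^{4/3}T^{2/3}} = K^{5/12}\,T^{1/12},
\qquad
K^{5/2}\sqrt{T} \le K^{4/3}T^{2/3}
\iff K^{7/6} \le T^{1/6}
\iff T \ge K^{7}.
$$

Under these assumptions, the new bandit bound improves on the previous one by a factor of order $K^{5/12}T^{1/12}$, and the bound under all-winner feedback is the smaller of the two new bounds once the horizon $T$ is at least of order $K^{7}$.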
