No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints

Gagan Aggarwal; Giannis Fikioris; Mingfei Zhao

TL;DR

The paper studies online autobidding in non-truthful auctions under budget and ROI constraints. It models $T$ sequential auctions in which the buyer's value and the highest competing bid, $(v_t,d_t)$, are drawn i.i.d. from an unknown joint distribution, and the platform may run any mixture of first- and second-price formats. The authors develop a primal-dual online-learning framework that handles both constraints, use the class of Lipschitz value-to-bid functions as the benchmark, and apply a tree-based discretization to obtain near-optimal $\tilde{O}(\sqrt{T})$ regret under full-information feedback. They also show a gap between the full-information and bandit settings: an $\Omega(T^{2/3})$ regret lower bound for first-price auctions with bandit feedback, and an $\tilde{O}(T^{3/4})$ upper bound when the value distribution is known and independent of the highest competing bid. A polynomial-time full-information algorithm preserves the guarantees when $v_t$ and $d_t$ are independent. Exact ROI satisfaction is obtained through a reduction that trades some rounds for a strictly ROI-feasible plan.

Abstract

Advertisers increasingly use automated bidding to optimize their ad campaigns on online advertising platforms. Autobidding optimizes an advertiser's objective subject to various constraints, e.g. average ROI and budget constraints. In this paper, we study the problem of designing online autobidding algorithms to optimize value subject to ROI and budget constraints when the platform is running any mixture of first- and second-price auctions. We consider the following stochastic setting: There is an item for sale in each of $T$ rounds. In each round, buyers submit bids and an auction is run to sell the item. We focus on one buyer, possibly with budget and ROI constraints. We assume that the buyer's value and the highest competing bid are drawn i.i.d. from some unknown (joint) distribution in each round. We design a low-regret bidding algorithm that satisfies the buyer's constraints. Our benchmark is the objective value achievable by the best possible Lipschitz function that maps values to bids, which is rich enough to best respond to many different correlation structures between value and highest competing bid. Our main result is an algorithm with full information feedback that guarantees a near-optimal $\tilde O(\sqrt T)$ regret with respect to the best Lipschitz function. Our result applies to a wide range of auctions, most notably any mixture of first- and second-price auctions (the price is a convex combination of the first and second price). In addition, our result holds for both value-maximizing buyers and quasi-linear utility-maximizing buyers. We also study the bandit setting, where we show an $\Omega(T^{2/3})$ lower bound on the regret for first-price auctions, showing a large disparity between the full information and bandit settings. We also design an algorithm with $\tilde O(T^{3/4})$ regret when the value distribution is known and is independent of the highest competing bid.
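To make the setting in the abstract concrete, here is a minimal Python sketch of one buyer bidding in a mixture of first- and second-price auctions. It is an illustration under stated assumptions, not the paper's algorithm: the names `mixture_price`, `simulate`, `roi_satisfied`, and `shaded_bid`, the mixing weight `alpha`, the rule that ties go to the buyer, the rule that bidding stops once the budget would be exceeded, and the ROI form "total value at least `target` times total spend" are all hypothetical choices made for this example.

```python
import random


def mixture_price(bid, competing_bid, alpha):
    """Price paid on a win: a convex combination of the first price (own bid)
    and the second price (highest competing bid). `alpha` is an assumed name."""
    return alpha * bid + (1.0 - alpha) * competing_bid


def simulate(bid_fn, rounds, alpha, budget):
    """Apply a value-to-bid function to a sequence of (value, competing_bid) pairs.

    Assumptions of this sketch: the buyer wins when bid >= competing_bid, and
    stops bidding as soon as the next payment would exceed the budget.
    """
    total_value = 0.0
    total_spend = 0.0
    for value, competing_bid in rounds:
        bid = bid_fn(value)
        if bid < competing_bid:
            continue  # auction lost: no value gained, nothing paid
        price = mixture_price(bid, competing_bid, alpha)
        if total_spend + price > budget:
            break  # hypothetical stopping rule to keep the budget constraint
        total_value += value
        total_spend += price
    return total_value, total_spend


def roi_satisfied(total_value, total_spend, target=1.0):
    """Assumed ROI constraint: total value is at least target * total spend."""
    return total_value >= target * total_spend


def shaded_bid(value, factor=0.5):
    """A simple Lipschitz value-to-bid map (Lipschitz constant `factor`)."""
    return factor * value


# Usage: values and competing bids drawn i.i.d. (here independently, uniform on [0, 1]).
rng = random.Random(0)
sample_rounds = [(rng.random(), rng.random()) for _ in range(1000)]
value, spend = simulate(shaded_bid, sample_rounds, alpha=0.5, budget=50.0)
print(value, spend, roi_satisfied(value, spend))
```

Setting `alpha=1.0` recovers a first-price auction and `alpha=0.0` a second-price auction. The benchmark in the abstract would correspond to the best such `bid_fn` among Lipschitz functions, chosen with knowledge of the distribution.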

Paper Structure (26 sections, 21 theorems, 138 equations, 1 figure, 4 algorithms)

Introduction
Lagrangian Maximization in Non-Truthful Auctions
No-Regret Primal Algorithm against Adaptive Adversary and Time-Varying Range
From Standard Regret to Interval Regret
Related work
Preliminaries
Primal algorithm designs with full information
Overview of the Primal Algorithm Design
Time-varying Reward Ranges and Good Actions
Tree Algorithm for Learning to Bid with Lipschitz Functions
Reduction from Regret to Interval Regret
Exact satisfaction of the ROI constraint
Bandit information
Regret Lower Bound for Bandit Information in First-price Auctions
Regret Upper Bound
...and 11 more sections

Key Result

Theorem 1.1

There is an algorithm that achieves $\tilde{O}(\sqrt{T})$ regret, where $\tilde{O}(\sqrt{T}) = O(\sqrt{T}\cdot \text{poly}(\log(T)))$, while satisfying both the budget and ROI constraints, with respect to the best Lipschitz bidding function given the knowledge of the distribution. The result applies to vari
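For orientation, one plausible way to write the benchmark and the regret for a value-maximizing buyer is sketched below. This is an illustrative formalization, not a statement copied from the paper: the symbols $\mathcal{F}_L$ (the class of $L$-Lipschitz value-to-bid functions), $\rho$ (per-round budget), $p(b,d)$ (price paid when bid $b$ wins against highest competing bid $d$), the tie-breaking rule $b \ge d$, and an ROI target of 1 are all assumptions.

$$
\mathrm{OPT} \;=\; \max_{f \in \mathcal{F}_L} \; \mathbb{E}\big[v \cdot \mathbf{1}\{f(v) \ge d\}\big]
\quad \text{s.t.} \quad
\mathbb{E}\big[p(f(v),d)\,\mathbf{1}\{f(v) \ge d\}\big] \le \rho,
\qquad
\mathbb{E}\big[p(f(v),d)\,\mathbf{1}\{f(v) \ge d\}\big] \le \mathbb{E}\big[v \cdot \mathbf{1}\{f(v) \ge d\}\big]
$$

$$
\mathrm{Regret}_T \;=\; T \cdot \mathrm{OPT} \;-\; \sum_{t=1}^{T} v_t \cdot \mathbf{1}\{b_t \ge d_t\}
$$

Here $b_t$ is the bid placed by the algorithm in round $t$. Under this reading, the theorem says that $\mathrm{Regret}_T$ grows as $\sqrt{T}$ up to polylogarithmic factors while both constraints hold.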

Figures (1)

Figure 1: The algorithm structure of the entire primal/dual framework in our setting.

Theorems & Definitions (37)

Theorem 1.1: Informal version of \ref{thm:tight:main_tight}
Theorem 1.2: Informal version of \ref{thm:bandit_lower_bound}
Theorem 1.3: Informal version of \ref{thm:bandit_ub}
Theorem 2.1: Theorem 6.9 of DBLP:journals/corr/CastiglioniCK23, adapted to auctions
Theorem 3.1
Theorem 3.2
Definition 3.1: Good action
Theorem 3.3
Theorem 3.4
Theorem 3.5
...and 27 more
