Online Bidding under RoS Constraints without Knowing the Value

Sushant Vijayan, Zhe Feng, Swati Padmanabhan, Karthikeyan Shanmugam, Arun Suggala, Di Wang

TL;DR

This work tackles online bidding under Return-on-Spend (RoS) and budget constraints when the value of each impression is unknown. It introduces a UCB-style algorithm (UCB-RoS) that maintains confidence sets for allocation, pricing, and value estimates, selecting bids to maximize value while respecting long-term constraints. The theoretical contributions establish near-optimal regret and constraint-violation bounds in the stochastic setting, with favorable dependence on the bid space and without relying on Slater-point assumptions. Empirical results on synthetic data corroborate strong performance, demonstrating improved regret trade-offs over prior approaches and practical computational efficiency for real-time bidding scenarios.

Abstract

We consider the problem of bidding in online advertising, where an advertiser aims to maximize value while adhering to budget and Return-on-Spend (RoS) constraints. Unlike prior work that assumes knowledge of the value generated by winning each impression (e.g., conversions), we address the more realistic setting where the advertiser must simultaneously learn the optimal bidding strategy and the value of each impression opportunity. This introduces a challenging exploration-exploitation dilemma: the advertiser must balance exploring different bids to estimate impression values with exploiting current knowledge to bid effectively. To address this, we propose a novel Upper Confidence Bound (UCB)-style algorithm that carefully manages this trade-off. Via a rigorous theoretical analysis, we prove that our algorithm achieves $\widetilde{O}(\sqrt{T\log(|\mathcal{B}|T)})$ regret and constraint violation, where $T$ is the number of bidding rounds and $\mathcal{B}$ is the domain of possible bids. This establishes the first optimal regret and constraint violation bounds for bidding in the online setting with unknown impression values. Moreover, our algorithm is computationally efficient and simple to implement. We validate our theoretical findings through experiments on synthetic data, demonstrating that our algorithm exhibits strong empirical performance compared to existing approaches.
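
To make the exploration-exploitation trade-off concrete, the following is a minimal sketch of an optimistic (UCB-style) bidder over a finite bid grid. It is not the paper's UCB-RoS algorithm. It is a simplified illustration under several assumptions that do not come from the source: each bid is treated as an independent arm (no information sharing across bids), the RoS target is taken to be value at least equal to spend, the budget is expressed as a per-round spend cap, the confidence radius is a generic Hoeffding-style width, and the simulated environment is a second-price auction against a uniform competing bid. The names `OptimisticBidder`, `simulate`, and `per_round_budget` are hypothetical.

```python
import math
import random


class OptimisticBidder:
    """Toy optimistic bidder on a finite bid grid (illustrative only)."""

    def __init__(self, bids, horizon, per_round_budget):
        self.bids = sorted(bids)          # ascending, so index 0 spends least
        self.horizon = horizon
        self.rho = per_round_budget       # assumed per-round spend cap
        k = len(self.bids)
        self.counts = [0] * k             # times each bid was placed
        self.value_sum = [0.0] * k        # realized value per bid
        self.price_sum = [0.0] * k        # realized payment per bid

    def _radius(self, i):
        # Generic Hoeffding-style width (an assumption, not the paper's).
        log_term = math.log(len(self.bids) * self.horizon)
        return math.sqrt(2.0 * log_term / self.counts[i])

    def select(self):
        # Place every bid once before using confidence bounds.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        best, best_ucb = 0, -math.inf     # fall back to the lowest bid
        for i in range(len(self.bids)):
            r = self._radius(i)
            value_ucb = self.value_sum[i] / self.counts[i] + r
            price_lcb = max(0.0, self.price_sum[i] / self.counts[i] - r)
            # Optimistic feasibility: RoS (value >= spend) and budget cap.
            feasible = value_ucb >= price_lcb and price_lcb <= self.rho
            if feasible and value_ucb > best_ucb:
                best, best_ucb = i, value_ucb
        return best

    def update(self, i, value, price):
        self.counts[i] += 1
        self.value_sum[i] += value
        self.price_sum[i] += price


def simulate(horizon=2000, seed=0):
    """Run the toy bidder in an assumed second-price environment."""
    rng = random.Random(seed)
    bids = [j / 10 for j in range(11)]
    bidder = OptimisticBidder(bids, horizon, per_round_budget=0.5)
    total_value = total_spend = 0.0
    for _ in range(horizon):
        i = bidder.select()
        competing = rng.random()                    # highest competing bid
        won = bidder.bids[i] > competing
        value = float(won and rng.random() < 0.3)   # value seen only on a win
        price = competing if won else 0.0           # second-price payment
        bidder.update(i, value, price)
        total_value += value
        total_spend += price
    return bidder, total_value, total_spend
```

Calling `simulate()` returns the bidder together with the cumulative value and spend, which can be compared to check how closely the run tracks the assumed RoS target. Treating bids as independent arms is the main simplification here: it ignores any structure shared across bids, so it should not be expected to reproduce the logarithmic dependence on $|\mathcal{B}|$ stated in the abstract.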

Paper Structure

This paper contains 21 sections, 8 theorems, 18 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

We propose an algorithm (algo_ucb_stoch) designed for value maximization in online advertising auctions with return-on-spend (RoS) and budget constraints, without any prior knowledge of the values associated with incoming user queries. In the stochastic setting described earlier, with an online hori

Figures (2)

  • Figure 1: Figures (a), (b) show the distribution over bids for the competing bidders. Figure (c) shows the expected pricing, expected value and budget curves over the bids.
  • Figure 2: Comparison between UCB-RoS (in yellow), castiglioni2022unifying (in green), and bernasconi2024beyond (in blue).

Theorems & Definitions (9)

  • Theorem (Informal): see \ref{thm:main_theorem}
  • Lemma 1
  • Lemma 2
  • Theorem 3.1
  • Proposition 3.1
  • Remark 3.1: Extension to linear bandits
  • Theorem 3.2: High-probability bounds
  • Lemma 3
  • Theorem