Selling Joint Ads: A Regret Minimization Perspective
Gagan Aggarwal, Ashwinkumar Badanidiyuru, Paul Dütting, Federico Fusco
TL;DR
The paper studies selling a single non-excludable good to two cooperating buyers in an online learning setting, formalizing the Repeated Joint Ads problem and analyzing regret with respect to the best fixed DSIC/IR mechanism. It develops adaptive discretization approaches that reduce the hard-to-search mechanism space to tractable representations: orthogonal mechanisms for stochastic data and path-based discretizations for smooth adversaries. It gives a stochastic upper bound of $\tilde{O}(T^{3/4})$ via Augment-the-Best-Mechanism and an $O(T^{2/3})$ bound via PathLearning in the $\sigma$-smooth regime. It complements these with an $\Omega(\sqrt{T})$ lower bound for the stochastic and smooth settings and an impossibility result for sublinear regret against a fully adversarial input; together, these results separate the stochastic and smooth settings from the adversarial one. The work extends online learning in economic design to non-excludable, multi-buyer settings, introduces new algorithmic tools (adaptive grids, augmented-grid mechanisms, and edge-based sampling for path-experts), and points to future extensions to larger coalitions and broader non-excludable mechanisms, with potential real-world impact in collaborative advertising markets.
Abstract
Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand). This problem captures, for example, situations where a merchant and a brand cooperatively bid in an auction to advertise a product, and both benefit from the ad being shown. A mechanism collects bids from the two and decides whether to allocate and which payments the two parties should make. This gives rise to intricate incentive compatibility constraints, e.g., on how to split payments between the two parties. We approach the problem of finding a revenue-maximizing incentive-compatible mechanism from an online learning perspective; this poses significant technical challenges. First, the action space (the class of all possible mechanisms) is huge; second, the function that maps mechanisms to revenue is highly irregular, ruling out standard discretization-based approaches. In the stochastic setting, we design an efficient learning algorithm achieving a regret bound of $O(T^{3/4})$. Our approach is based on an adaptive discretization scheme of the space of mechanisms, as any non-adaptive discretization fails to achieve sublinear regret. In the adversarial setting, we exploit the non-Lipschitzness of the problem to prove a strong negative result, namely that no learning algorithm can achieve more than half of the revenue of the best fixed mechanism in hindsight. We then consider the $\sigma$-smooth adversary; we construct an efficient learning algorithm that achieves a regret bound of $O(T^{2/3})$ and builds on a succinct encoding of exponentially many experts. Finally, we prove that no learning algorithm can achieve less than $\Omega(\sqrt{T})$ regret in both the stochastic and the smooth setting, thus narrowing the range where the minimax regret rates for these two problems lie.
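To make the setting concrete, the following minimal Python sketch evaluates one very simple incentive-compatible mechanism: a fixed pair of posted prices, one per buyer, with the item allocated only when both buyers accept. This class is chosen purely for illustration and is an assumption of this sketch, not the mechanism family or the learning algorithm analyzed in the paper; the function names and the uniform grid are likewise hypothetical. It also shows the irregularity the abstract refers to: raising one price slightly above a bid drops the revenue of that round to zero.

```python
from itertools import product


def posted_price_revenue(prices, bids):
    """Revenue of a fixed price pair (p1, p2) on one bid pair (b1, b2).

    Illustrative mechanism (an assumption of this sketch): the item is
    non-excludable, so it is allocated only if both buyers accept their
    own price, and each buyer then pays exactly that price. A buyer's
    payment never depends on their own bid, so truthful bidding is a
    dominant strategy, and no buyer pays more than their bid.
    """
    (p1, p2), (b1, b2) = prices, bids
    return p1 + p2 if (b1 >= p1 and b2 >= p2) else 0.0


def total_revenue(prices, bid_history):
    """Cumulative revenue of one fixed price pair over all rounds."""
    return sum(posted_price_revenue(prices, bids) for bids in bid_history)


def best_fixed_on_grid(bid_history, k):
    """Best fixed price pair in hindsight on a uniform (k+1) x (k+1) grid.

    A plain, non-adaptive grid is used here only to keep the example
    short; the abstract notes that non-adaptive discretizations do not
    suffice for sublinear regret in general.
    """
    grid = [i / k for i in range(k + 1)]
    return max(product(grid, grid), key=lambda p: total_revenue(p, bid_history))


if __name__ == "__main__":
    history = [(0.5, 0.5), (0.5, 0.5), (1.0, 0.25)]
    best = best_fixed_on_grid(history, k=4)
    print(best, total_revenue(best, history))
    # Revenue is discontinuous in the prices: a tiny increase loses the sale.
    print(posted_price_revenue((0.5, 0.5), (0.5, 0.5)))
    print(posted_price_revenue((0.51, 0.5), (0.5, 0.5)))
```

The regret of a learner over these rounds would be the gap between the revenue of the best fixed mechanism in hindsight and the revenue the learner actually collected; the sketch computes only the benchmark side, restricted to this toy class.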
