An Adaptable Budget Planner for Enhancing Budget-Constrained Auto-Bidding in Online Advertising
Zhijian Duan, Yusen Huo, Tianyu Wang, Zhilin Zhang, Yeshu Li, Chuan Yu, Jian Xu, Bo Zheng, Xiaotie Deng
TL;DR
This work tackles budget-constrained auto-bidding in online advertising by introducing ABPlanner, a few-shot adaptable budget planner that sits above a low-level auto-bidder in a hierarchical framework. ABPlanner casts budget planning as an MDP: a high-level budget plan over $m$ stages is adjusted episode by episode for each advertiser, using data from previous episodes as a prompt (in-context reinforcement learning), and the planner is trained with PPO. In pure-simulation and semi-simulation experiments and in real-world A/B testing, ABPlanner consistently improves the cumulative value achieved by auto-bidders and adapts rapidly to unseen advertisers. The results indicate that a learned, sample-efficient budget planner can be integrated into real-time bidding stacks in practice; possible future work includes joint learning with the auto-bidder and per-stage dynamic planning.
Abstract
In online advertising, advertisers commonly use auto-bidding services to bid for impression opportunities. A typical objective of the auto-bidder is to maximize the advertiser's cumulative value of winning impressions within specified budget constraints. However, this problem is challenging because diverse advertisers face complex bidding environments. To address this challenge, we introduce ABPlanner, a few-shot adaptable budget planner designed to improve budget-constrained auto-bidding. ABPlanner is based on a hierarchical bidding framework that decomposes the bidding process into shorter, manageable stages. Within this framework, ABPlanner allocates the budget across all stages, allowing a low-level auto-bidder to bid based on the budget allocation plan. The adaptability of ABPlanner is achieved through a sequential decision-making approach inspired by in-context reinforcement learning. For each advertiser, ABPlanner adjusts the budget allocation plan episode by episode, using data from previous episodes as a prompt for current decisions. This enables ABPlanner to adapt quickly to different advertisers with few-shot data, providing a sample-efficient solution. Extensive simulation experiments and real-world A/B testing validate the effectiveness of ABPlanner, demonstrating its capability to enhance the cumulative value achieved by auto-bidders.
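To make the hierarchical framework concrete, the following is a minimal Python sketch of the control flow described above, not the authors' implementation. Everything in it is an illustrative assumption: the function names (`normalize_plan`, `run_episode`, `adapt`, `uniform_planner`), the toy auction in which an impression is won when the bid meets its price and the price is then paid, the rule that unspent stage budget is not carried over, and the truthful low-level bidder. The learned planner is replaced by a placeholder that ignores its prompt.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Impression = Tuple[float, float]  # (value to the advertiser, price to win); toy model
History = List[Dict[str, object]]  # per-episode records used as the planner's "prompt"


def normalize_plan(weights: Sequence[float], total_budget: float) -> List[float]:
    """Scale non-negative stage weights so the stage budgets sum to total_budget."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [total_budget * w / total for w in weights]


def run_episode(
    stage_budgets: Sequence[float],
    stages: Sequence[Sequence[Impression]],
    bid_fn: Callable[[float, float], float],
) -> Tuple[float, List[float]]:
    """Low-level bidding for one episode; returns (total value, spend per stage).

    Assumption: unspent budget in a stage is NOT carried over to later stages.
    """
    total_value = 0.0
    spend: List[float] = []
    for budget, impressions in zip(stage_budgets, stages):
        remaining = budget
        spent = 0.0
        for value, price in impressions:
            bid = bid_fn(value, remaining)
            # Toy auction: win if the bid meets the price and the stage can afford it.
            if bid >= price and price <= remaining:
                remaining -= price
                spent += price
                total_value += value
        spend.append(spent)
    return total_value, spend


def uniform_planner(history: History, m: int) -> List[float]:
    """Placeholder for the learned planner: ignores the prompt, splits evenly."""
    return [1.0] * m


def adapt(planner, stages, total_budget, bid_fn, n_episodes: int) -> History:
    """Episode-by-episode loop: each new plan is conditioned on earlier episodes."""
    history: History = []
    for _ in range(n_episodes):
        plan = normalize_plan(planner(history, len(stages)), total_budget)
        value, spend = run_episode(plan, stages, bid_fn)
        history.append({"plan": plan, "spend": spend, "value": value})
    return history


def truthful(value: float, remaining: float) -> float:
    """Hypothetical low-level bidder that simply bids the impression's value."""
    return value


# Two stages: the second contains cheaper, more valuable impressions.
stages = [
    [(1.0, 2.0), (1.0, 0.9)],
    [(2.0, 1.0), (2.0, 1.0), (2.0, 1.0)],
]
uniform_value, uniform_spend = run_episode(normalize_plan([1, 1], 4.0), stages, truthful)
skewed_value, skewed_spend = run_episode(normalize_plan([1, 3], 4.0), stages, truthful)
print(uniform_value, skewed_value)
```

In this toy setting, the same low-level bidder obtains more value when the plan shifts budget toward the second stage, which is the kind of gain a planner is meant to find. In ABPlanner the plan would come from a policy conditioned on `history` rather than from a hand-picked weight vector.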
