Constraint-Aware Generative Auto-bidding via Pareto-Prioritized Regret Optimization

Binglin Wu; Yingyi Zhang; Xianneng Li; Ruyue Deng; Chuan Yue; Weiru Zhang; Xiaoyi Zeng

TL;DR

This work tackles constrained auto-bidding in dynamic advertising by addressing two limitations of Decision Transformer-based approaches: Return-to-Go (RTG) conditioning that ignores cost, and a regression objective that pulls the policy toward average historical behavior. It introduces PRO-Bid, which combines Constraint-Decoupled Pareto Representation (CDPR) with Counterfactual Regret Optimization (CRO) to enable constraint-aware sequence modeling and active policy improvement toward the Pareto frontier. CDPR decouples the budgeted problem into Return-to-Go $R_t$ and Cost-to-Go $C_t$ streams and emphasizes high-quality trajectories via Pareto-prioritized filtering. CRO uses a global outcome predictor to identify superior counterfactual actions and steers learning toward them through regret-weighted regression. In offline experiments on AuctionNet and AuctionNet-Sparse and in online A/B tests on AliExpress, PRO-Bid achieves better constraint satisfaction and value than the baselines and remains robust under data noise and dynamic CPA targets, which suggests it is practical for large-scale constrained advertising systems.
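The two conditioning streams can be illustrated with a minimal sketch. It assumes that both $R_t$ and $C_t$ are plain suffix sums over one logged episode (value still to be earned and cost still to be spent from step $t$ onward); the paper's exact definition of Cost-to-Go may differ, for example by normalizing against the budget. The function names `suffix_sums` and `dual_stream_context` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple


def suffix_sums(xs: Sequence[float]) -> List[float]:
    """Return out[t] = xs[t] + xs[t+1] + ... + xs[-1]."""
    out = [0.0] * len(xs)
    running = 0.0
    # Walk backwards so each step adds exactly one new term.
    for t in range(len(xs) - 1, -1, -1):
        running += xs[t]
        out[t] = running
    return out


def dual_stream_context(
    values: Sequence[float], costs: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Build (Return-to-Go, Cost-to-Go) streams for one episode.

    Assumption: both streams are suffix sums of the per-step
    value and cost; this is a sketch, not the paper's definition.
    """
    if len(values) != len(costs):
        raise ValueError("values and costs must have the same length")
    return suffix_sums(values), suffix_sums(costs)


# Example episode with three decision steps.
rtg, ctg = dual_stream_context([1.0, 2.0, 3.0], [0.5, 0.5, 1.0])
```

Two episodes with the same Return-to-Go but different remaining cost then map to different conditioning inputs, which is the cost-awareness that RTG-only conditioning lacks.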

Abstract

Auto-bidding systems aim to maximize marketing value while satisfying strict efficiency constraints such as Target Cost-Per-Action (CPA). Although Decision Transformers provide powerful sequence modeling capabilities, applying them to this constrained setting encounters two challenges: 1) standard Return-to-Go conditioning causes state aliasing by neglecting the cost dimension, preventing precise resource pacing; and 2) standard regression forces the policy to mimic average historical behaviors, thereby limiting the capacity to optimize performance toward the constraint boundary. To address these challenges, we propose PRO-Bid, a constraint-aware generative auto-bidding framework based on two synergistic mechanisms: 1) Constraint-Decoupled Pareto Representation (CDPR) decomposes global constraints into recursive cost and value contexts to restore resource perception, while reweighting trajectories based on the Pareto frontier to focus on high-efficiency data; and 2) Counterfactual Regret Optimization (CRO) facilitates active improvement by utilizing a global outcome predictor to identify superior counterfactual actions. By treating these high-utility outcomes as weighted regression targets, the model transcends historical averages to approach the optimal constraint boundary. Extensive experiments on two public benchmarks and online A/B tests demonstrate that PRO-Bid achieves superior constraint satisfaction and value acquisition compared to state-of-the-art baselines.
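To make the idea of reweighting trajectories by the Pareto frontier concrete, here is a minimal sketch. It assumes each logged trajectory is summarized by a pair (total value, total cost), that higher value and lower cost are both preferred, and that frontier trajectories simply receive a larger sampling weight than dominated ones. The weight values and the names `pareto_front` and `priority_weights` are hypothetical; the paper's actual prioritization scheme is not specified in this summary.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (total_value, total_cost) of one trajectory


def pareto_front(points: Sequence[Point]) -> List[int]:
    """Return indices of trajectories not dominated by any other.

    Trajectory j dominates i if it has value >= and cost <= those of i,
    with at least one of the two comparisons strict.
    """
    front = []
    for i, (v_i, c_i) in enumerate(points):
        dominated = any(
            (v_j >= v_i and c_j <= c_i) and (v_j > v_i or c_j < c_i)
            for j, (v_j, c_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


def priority_weights(
    points: Sequence[Point], on_front: float = 1.0, off_front: float = 0.25
) -> List[float]:
    """Assign a larger weight to frontier trajectories.

    Assumption: a two-level weighting; the defaults are illustrative only.
    """
    front = set(pareto_front(points))
    return [on_front if i in front else off_front for i in range(len(points))]


trajectories = [(10.0, 5.0), (8.0, 5.0), (12.0, 9.0), (6.0, 2.0), (6.0, 3.0)]
front = pareto_front(trajectories)
weights = priority_weights(trajectories)
```

In this example the second trajectory is dominated by the first (same cost, less value) and the last by the fourth (same value, more cost), so both are down-weighted rather than discarded.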

Paper Structure (34 sections, 23 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Preliminaries
Problem Formulation of Auto-Bidding
Generative Bidding via DT
Framework
Constraint-Decoupled Pareto Representation
Dual-Stream Context Construction
Pareto-Prioritized Experience Filtering
Counterfactual Regret Optimization
Probabilistic Policy with Gaussian Head
Global Outcome Prediction
Full-Episode Constraint-Aware Utility
Optimization via Regret-Weighted Regression
Total Training Objective
Offline Experiment
...and 19 more sections

Figures (8)

Figure 1: Overall framework of PRO-Bid.
Figure 2: Comparison in different CPA constraint settings.
Figure 3: Performance under different noise augmentation.
Figure 4: Visualization of inference results on AuctionNet.
Figure 5: Online Auto-bidding System.
...and 3 more figures
