A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization

Yiqin Lv, Zhiyu Mou, Miao Xu, Jinghao Chen, Qi Wang, Yixiu Mao, Yun Qu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng, Xiangyang Ji

TL;DR

This work tackles generalization under distribution shift in multi-task online auto-bidding. It introduces Validation-Aligned Multi-task Optimization (VAMO), which adaptively weights tasks by the alignment between each task's training gradient and a held-out validation gradient, using weights $w_i^* = \mathrm{softmax}(m_i/\lambda)$, where $m_i = (\langle g_i^{\mathrm{val}}, g_{i,1}^{\mathrm{train}}\rangle, \dots, \langle g_i^{\mathrm{val}}, g_{i,K}^{\mathrm{train}}\rangle)$. The framework is coupled with a periodicity-aware, TimesNet-based temporal module in a unified generative auto-bidding backbone to enable cross-task transfer of seasonal structure. The authors provide a convergence analysis showing a sublinear rate $O(1/I)$ to a stationary point under mild assumptions, and validate the approach with extensive simulations and large-scale real-world data, reporting significant improvements over baselines. The result is a practical, theory-grounded approach to robust multi-task auto-bidding in nonstationary online advertising environments.
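The weighting rule in the TL;DR can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `vamo_weights` and `combined_direction`, the flattened-gradient layout, and the toy vectors are all hypothetical, and it assumes that $k = 1, \dots, K$ indexes the tasks being weighted while $i$ indexes the optimization step. The sketch only shows how the scores $m_{i,k}$ and the softmax weights with temperature $\lambda$ would be computed for a single step.

```python
import numpy as np


def vamo_weights(g_val, g_train, lam=1.0):
    """Softmax weights over K training gradients (illustrative only).

    g_val   : shape (D,)   flattened validation gradient (assumed layout)
    g_train : shape (K, D) one flattened training gradient per task
    lam     : temperature lambda > 0
    """
    g_val = np.asarray(g_val, dtype=float)
    g_train = np.asarray(g_train, dtype=float)
    # Alignment scores m_k = <g_val, g_train_k>
    m = g_train @ g_val
    # Softmax of m / lam, shifted by the max for numerical stability
    z = m / lam
    z = z - z.max()
    w = np.exp(z)
    return w / w.sum()


def combined_direction(g_val, g_train, lam=1.0):
    """Weighted combination sum_k w_k * g_train_k of the training gradients."""
    w = vamo_weights(g_val, g_train, lam)
    return w @ np.asarray(g_train, dtype=float)


# Toy example: three task gradients that are aligned with, orthogonal to,
# and opposed to the validation gradient, respectively.
g_val = np.array([1.0, 0.0])
g_train = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
weights = vamo_weights(g_val, g_train, lam=1.0)
direction = combined_direction(g_val, g_train, lam=1.0)
```

In this toy setup the aligned task receives the largest weight and the opposed task the smallest. A small $\lambda$ concentrates nearly all weight on the best-aligned task, while a large $\lambda$ pushes the weights toward uniform.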

Abstract

In online advertising, heterogeneous advertiser requirements give rise to numerous customized bidding tasks that are typically optimized independently, resulting in extensive computation and limited data efficiency. Multi-task learning offers a principled framework to train these tasks jointly through shared representations. However, existing multi-task optimization strategies are primarily guided by training dynamics and often generalize poorly in volatile bidding environments. To this end, we present Validation-Aligned Multi-task Optimization (VAMO), which adaptively assigns task weights based on the alignment between per-task training gradients and a held-out validation gradient, thereby steering updates toward validation improvement and better matching deployment objectives. We further equip the framework with a periodicity-aware temporal module and couple it with an advanced generative auto-bidding backbone to enhance cross-task transfer of seasonal structure and strengthen bidding performance. Meanwhile, we provide theoretical insights into the proposed method, e.g., convergence guarantee and alignment analysis. Extensive experiments on both simulated and large-scale real-world advertising systems consistently demonstrate significant improvements over typical baselines, illuminating the effectiveness of the proposed approach.

Paper Structure

This paper contains 18 sections, 3 theorems, 37 equations, 4 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Let $m_{i,k} \triangleq \langle \bm g_i^{\mathrm{val}},\, \bm g_{i,k}^{\mathrm{train}}\rangle$ and $m_i=(m_{i,1},\dots,m_{i,K})^\top\in\mathbb{R}^K$. Let $\bm d_i^\star \in \arg\max_{\mathbf{w}\in \Delta^K} \langle \bm g_i^{\mathrm{val}}, \sum_k w_{i,k} \bm g_{i,k}^{\mathrm{train}}\rangle$, $d_i=\su

Figures (4)

  • Figure 1: Periodic patterns in nonstationary environments. The bidding environments exhibit nonstationary dynamics with recurring temporal structures, such as diurnal periodicity.
  • Figure 2: The overall flowchart of VAMO and the multi-task learning architecture. The online-generated dataset is partitioned over time into a training set and a validation set; distribution shift is likely when the bidding environment changes substantially. The neural architecture extracts the shared information and feeds task-specific generative auto-bidding modules. VAMO learns to align with the objective of the shifted test environment while balancing multi-task performance.
  • Figure 3: Ablation on validation signal. Error bars denote the standard deviation (3 runs).
  • Figure 4: Ablation on $\lambda$. Error bars denote the standard deviation (3 runs).

Theorems & Definitions (5)

  • Lemma 1: Maximal alignment among convex combinations
  • Theorem 1: Convergence
  • Corollary 1
  • proof
  • proof