AutoTask: Task Aware Multi-Faceted Single Model for Multi-Task Ads Relevance

Shouchang Guo, Sonam Damani, Keng-hao Chang

TL;DR

This work tackles the challenge of multi-task ads relevance across diverse ad scenarios by reframing feature combination and cross-task interactions as a language modeling problem. It introduces a two-facet architecture: (1) Task Aware Feature Modeling with a novel Task ID encoding that appends task identity to feature dimensions, and (2) Cross Task Interaction Modeling using auto-regressive attention over task blocks. The model is a lightweight GPT-based transformer trained on all tasks but capable of single-task inference, with strong generalization to unseen tasks (e.g., Clinics and Home) and competitive performance against task-specific baselines. Training leverages both teacher-label distillation and LLM-labeled data, and the approach demonstrates improved ROC AUC and PR AUC across multiple ad scenarios, suggesting practical impact for scalable, maintainable multi-task relevance systems in online serving.

Abstract

Ads relevance models are crucial in determining the relevance between user search queries and ad offers, a problem often framed as classification. The complexity of modeling increases significantly with multiple ad types and varying scenarios that exhibit both similarities and differences. In this work, we introduce a novel multi-faceted attention model that performs task aware feature combination and cross task interaction modeling. Our technique formulates the feature combination problem as "language" modeling with auto-regressive attention across both feature and task dimensions. Specifically, we introduce a new dimension of task ID encoding for task representations, thereby enabling precise relevance modeling across diverse ad scenarios with substantially improved generalization to unseen tasks. We demonstrate that our model not only effectively handles the increased computational and maintenance demands as scenarios proliferate, but also outperforms generalized DNN models and even task-specific models across a spectrum of ad applications using a single unified model.
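
The idea of appending a task ID to the feature dimension and then applying auto-regressive attention can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hot task encoding, the projection-free single-head attention, the helper names (`add_task_id`, `causal_self_attention`), and all sizes are assumptions chosen only to make the mechanism concrete.

```python
import numpy as np

def add_task_id(features, task_id, num_tasks):
    """Append a one-hot task ID to every feature token (assumed encoding)."""
    one_hot = np.zeros(num_tasks)
    one_hot[task_id] = 1.0
    tiled = np.tile(one_hot, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)

def causal_self_attention(x):
    """Single-head auto-regressive attention with no learned projections."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    # Mask future positions so token i attends only to tokens 0..i.
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))  # 5 feature tokens, 8 dims (hypothetical)
tokens = add_task_id(features, task_id=2, num_tasks=4)
out, weights = causal_self_attention(tokens)
```

Under these assumptions, every token carries its task identity in the extra dimensions, so the same attention weights can be shared across tasks while the attention scores remain task dependent.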

Paper Structure

This paper contains 17 sections, 1 equation, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Multi-task classification with the proposed multi-faceted single attention model. Facet 1: task aware feature modeling, enabled by the introduced dimension of task ID encoding; Facet 2: cross task interaction modeling with task blocks. The embeddings from the facets' transformers are fused to produce classification results.