Interpretable Deep Learning for Forecasting Online Advertising Costs: Insights from the Competitive Bidding Landscape

Fynn Oldenburg, Qiwei Han, Maximilian Kaiser

TL;DR

This study addresses the challenge of forecasting cost-per-click (CPC) in online advertising under a competitive bidding landscape by evaluating SARIMA, XGBoost, LSTM, and the Temporal Fusion Transformer (TFT) with covariates derived from cross-advertiser clustering. The TFT with distance-based competition covariates delivers the best multi-horizon accuracy and remains robust during shocks such as the COVID-19 pandemic, while providing interpretable feature importance and temporal attention. The work introduces a scalable covariate-selection framework from a large pool of advertisers and demonstrates that competition-aware multivariate models outperform univariate baselines and conventional tools like Google's Keyword Planner. Practically, the approach enables finer-grained budgeting decisions and strategic insights into competitive dynamics in digital advertising.

Abstract

As advertisers increasingly shift their budgets toward digital advertising, accurately forecasting advertising costs becomes essential for optimizing marketing campaign returns. This paper presents a comprehensive study that employs various time-series forecasting methods to predict daily average cost-per-click (CPC) in the online advertising market. We evaluate the performance of statistical models, machine learning techniques, and deep learning approaches, including the Temporal Fusion Transformer (TFT). Our findings reveal that multivariate models, enriched with covariates derived from competitors' CPC patterns through time-series clustering, significantly improve forecasting accuracy. We interpret the results by analyzing feature importance and temporal attention, demonstrating how the models leverage both the advertiser's own data and insights from the competitive landscape. Additionally, our method proves robust during major market shifts, such as the COVID-19 pandemic, consistently outperforming models that rely solely on individual advertisers' data. This study introduces a scalable technique for selecting relevant covariates from a broad pool of advertisers, offering more accurate long-term forecasts and strategic insights into budget allocation and competitive dynamics in digital advertising.
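
To make the idea of distance-based covariate selection more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each advertiser is represented by a short list of daily CPC values and ranks candidate competitors by a plain Dynamic Time Warping (DTW) distance to a target advertiser. The function names `dtw_distance` and `nearest_competitors`, the advertiser labels, and all numbers are hypothetical and chosen only for illustration; the paper's actual smoothing, clustering, and selection steps may differ.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b element
                cost[i][j - 1],      # b[j-1] also matches the previous a element
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def nearest_competitors(target, others, k=2):
    """Return the k advertiser names whose CPC series are closest to target."""
    ranked = sorted(others, key=lambda name: dtw_distance(target, others[name]))
    return ranked[:k]


# Hypothetical daily CPC series (illustrative values only).
target_cpc = [1.0, 1.2, 1.5, 1.4]
candidates = {
    "adv_a": [1.0, 1.0, 1.2, 1.5, 1.4],  # same shape, shifted by one day
    "adv_b": [3.0, 3.1, 2.9, 3.0],       # very different price level
    "adv_c": [1.1, 1.3, 1.5, 1.5],       # similar level and shape
}
selected = nearest_competitors(target_cpc, candidates, k=2)
```

In this toy setup, the CPC series of the selected advertisers would then be passed to a multivariate forecaster as additional covariates. DTW is used here because it tolerates small temporal shifts between otherwise similar series, which a point-by-point distance would penalize.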

Paper Structure (25 sections, 8 figures, 3 tables)

Figures (8)

  • Figure 1: An illustrated example of daily ad clicks, cost and CPC from one advertiser
  • Figure 2: Study setup showing the resulting configurations of predictive models, feature compositions, and clustering methods. SARIMA is used as a benchmark model and is only tested in a univariate configuration. The best-performing configuration is the Temporal Fusion Transformer based on multivariate features, including competitors' data identified through distance-based clustering (highlighted in grey).
  • Figure 3: Feature composition setup for our three configurations (added component highlighted in grey).
  • Figure 4: Example of matching two smoothed advertiser time series using Dynamic Time Warping as a distance measure.
  • Figure 5: Sankey diagram of cluster assignments using the distance-based (left) and extracted-features-based approach (right) compared to the advertisers' native category.
  • ...and 3 more figures