Can time series forecasting be automated? A benchmark and analysis
Anvitha Thirthapura Sreedhara, Joaquin Vanschoren
TL;DR
The paper introduces a comprehensive benchmark for time series forecasting, built on diverse datasets from the Monash Time Series Forecasting Repository, to systematically evaluate AutoGluon-Timeseries and sktime. It compares the two frameworks under standardized training budgets, alongside a tuned sktime pipeline, using SMAPE and MASE as the primary metrics. AutoGluon-Timeseries generally achieves stronger and more robust performance across frequencies and domains, with PatchTST and ensembling frequently contributing to the gains, while certain sktime methods excel in specific settings. The work demonstrates the value of automated forecasting pipelines for method selection and points to ways of extending AutoML-based forecasting with richer datasets, meta-models, and transfer learning to improve deployment in practice.
Abstract
In machine learning and artificial intelligence, time series forecasting plays a pivotal role across domains such as finance, healthcare, and weather. However, selecting the most suitable forecasting method for a given dataset is difficult because data patterns and characteristics vary widely. This research addresses that challenge by proposing a comprehensive benchmark for evaluating and ranking time series forecasting methods across a wide range of datasets. The study compares the performance of many methods from two prominent time series forecasting frameworks, AutoGluon-Timeseries and sktime, to shed light on their applicability in different real-world scenarios. It contributes to the field by providing a robust benchmarking methodology and by supporting informed decisions when choosing forecasting methods for optimal prediction performance.
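
The two primary metrics named in the TL;DR, SMAPE and MASE, can be illustrated with a minimal sketch. It uses common textbook definitions and is not taken from the paper: the SMAPE variant here ranges from 0 to 2 (some sources multiply by 100), zero-denominator terms are treated as zero error, and MASE is scaled by the in-sample error of a seasonal naive forecast. The function names and the `season` parameter are illustrative assumptions.

```python
from typing import Sequence


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in the range [0, 2].

    Assumption: a term whose denominator is zero (actual and forecast
    both zero) counts as zero error.
    """
    terms = []
    for actual, forecast in zip(y_true, y_pred):
        denom = abs(actual) + abs(forecast)
        terms.append(0.0 if denom == 0 else 2.0 * abs(forecast - actual) / denom)
    return sum(terms) / len(terms)


def mase(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    y_train: Sequence[float],
    season: int = 1,
) -> float:
    """Mean absolute scaled error.

    The forecast MAE is divided by the in-sample MAE of a seasonal naive
    forecast on the training series (season=1 is the plain naive forecast).
    """
    naive_errors = [
        abs(y_train[i] - y_train[i - season]) for i in range(season, len(y_train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


if __name__ == "__main__":
    train = [1.0, 2.0, 3.0, 4.0, 5.0]
    actual = [6.0, 7.0]
    forecast = [6.0, 8.0]
    print("SMAPE:", smape(actual, forecast))
    print("MASE:", mase(actual, forecast, train))
```

A MASE below 1 means the forecast beats the naive baseline on the training scale, which makes the metric comparable across series with different magnitudes. SMAPE is scale-free as well, but it becomes unstable when actual values are close to zero.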
