Time Series Data Augmentation as an Imbalanced Learning Problem

Vitor Cerqueira; Nuno Moniz; Ricardo Inácio; Carlos Soares

TL;DR

The paper tackles forecasting with collections of univariate time series, noting that global models require large datasets and can miss locally distinctive patterns. It reframes training as an imbalanced-domain learning problem and introduces Time Series Entity Resampling (TSER) to oversample toward a target time series using established resampling techniques like SMOTE. Through extensive experiments on seven datasets with 5502 time series, TSER variants often outperform both local and global baselines, demonstrating a favorable global-local trade-off for the target series, albeit with some degradation on other series. The work highlights a practical, targeted augmentation approach at the intersection of time-series forecasting and imbalanced learning, and outlines future extensions to multivariate settings and computational efficiency improvements.

Abstract

Recent state-of-the-art forecasting methods are trained on collections of time series. These methods, often referred to as global models, can capture common patterns in different time series to improve their generalization performance. However, they require large amounts of data that might not be readily available. Besides this, global models sometimes fail to capture relevant patterns unique to a particular time series. In these cases, data augmentation can be useful to increase the sample size of time series datasets. The main contribution of this work is a novel method for generating univariate time series synthetic samples. Our approach stems from the insight that the observations concerning a particular time series of interest represent only a small fraction of all observations. In this context, we frame the problem of training a forecasting model as an imbalanced learning task. Oversampling strategies are popular approaches used to deal with the imbalance problem in machine learning. We use these techniques to create synthetic time series observations and improve the accuracy of forecasting models. We carried out experiments using 7 different databases that contain a total of 5502 univariate time series. We found that the proposed solution outperforms both a global and a local model, thus providing a better trade-off between these two approaches.
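
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy series, the helper names `embed` and `smote_like`, the choice to normalize by dividing by the series mean, and the simplified SMOTE-style interpolation are all assumptions made for illustration. Each series is turned into lagged rows, the rows of the series of interest are treated as the minority class, and synthetic rows are interpolated between neighbouring minority rows before training on the pooled data.

```python
import random

def embed(series, n_lags):
    """Mean-normalize a series (assumed here: divide by its mean), then apply
    time delay embedding. Each row holds n_lags past values plus the next value."""
    mean = sum(series) / len(series)
    scaled = [v / mean for v in series]
    return [scaled[i:i + n_lags + 1] for i in range(len(scaled) - n_lags)]

def smote_like(rows, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (illustrative, needs >= 2 rows).
    Each synthetic row is a random interpolation between a minority row
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(rows)
        others = [r for r in rows if r is not base]
        # Sort the remaining rows by squared Euclidean distance to the base row.
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy collection; "target" plays the role of the series of interest.
collection = {
    "target": [10, 12, 11, 13, 12, 14, 13, 15],
    "other_a": [100, 90, 110, 95, 105, 100, 98, 102],
    "other_b": [5, 6, 7, 8, 9, 10, 11, 12],
}
n_lags = 3
rows = {name: embed(series, n_lags) for name, series in collection.items()}

# Rows from the target series are the minority class and get oversampled.
target_rows = rows["target"]
synthetic = smote_like(target_rows, n_new=10)

# The pooled training set: all original rows plus the synthetic target rows.
train = [r for series_rows in rows.values() for r in series_rows] + synthetic
```

In this sketch the forecast target is interpolated together with the lag features, which is one possible design choice; a model trained on `train` would see the target series more often than in the original collection.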

Paper Structure (26 sections, 4 equations, 5 figures, 5 tables)

Introduction
Background
Time Series Forecasting
Global Forecasting Models
Imbalanced Domain Learning
Tackling class imbalance
Resampling strategies
Time Series Entity Resampling
Data preparation
Resampling
Experiments
Datasets
Experimental Setup
Data Preparation
Evaluation
...and 11 more sections

Figures (5)

Figure 1: Workflow behind TSER. The collection of time series is transformed for supervised learning using mean normalization and time delay embedding. New synthetic samples are created using oversampling. The resulting dataset is used to build a model.
Figure 2: Average rank of each method across all time series.
Figure 3: Percentage difference in MASE between the respective method and each reference approach across all time series. Negative values denote better performance by the respective method.
Figure 4: Average difference in MASE between each approach when applied to the time series of interest and when it is applied to other time series of the same collection. Positive values denote a decrease in performance when the respective approach is applied in other time series.
Figure 5: Average rank of Global and TSER with varying sampling ratios.
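
Figures 3 and 4 are reported in terms of MASE (mean absolute scaled error). As a rough sketch of how such numbers could be computed, the snippet below uses the standard MASE definition, scaling the forecast error by the in-sample error of a seasonal naive forecast. The percentage-difference formula is an assumption chosen to match the caption of Figure 3 (negative values mean the method is better than the reference); the data and function names are hypothetical.

```python
def mase(y_train, y_true, y_pred, m=1):
    """Mean absolute scaled error.
    The scale is the in-sample mean absolute error of a seasonal naive
    forecast with period m (m=1 is the plain naive forecast)."""
    scale = sum(abs(y_train[i] - y_train[i - m])
                for i in range(m, len(y_train))) / (len(y_train) - m)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / scale

def pct_diff(mase_method, mase_reference):
    """Percentage difference relative to a reference approach (assumed formula).
    Negative values mean the method has a lower MASE than the reference."""
    return 100.0 * (mase_method - mase_reference) / mase_reference

# Hypothetical toy data for one time series.
y_train = [10, 12, 11, 13, 12, 14]
y_true = [13, 15]
method_pred = [13.5, 14.5]
reference_pred = [12, 14]

mase_method = mase(y_train, y_true, method_pred)
mase_reference = mase(y_train, y_true, reference_pred)
difference = pct_diff(mase_method, mase_reference)
```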
