IBMA: An Imputation-Based Mixup Augmentation Using Self-Supervised Learning for Time Series Data

Dang Nha Nguyen, Hai Dang Nguyen, Khoa Tho Anh Nguyen

TL;DR

This work addresses the scarcity of augmentation strategies for long-sequence time-series forecasting by introducing Imputation-Based Mixup Augmentation (IBMA), a two-phase framework comprising Self-Supervised Reconstruction (SSR) and Imputed-based Mixup Augmentation (IMA). By combining imputed data with Mixup, the method creates diverse yet structurally coherent training samples that help forecasting models generalize. Evaluations with DLinear, TimesNet, and iTransformer on ETTh1, ETTh2, ETTm1, and ETTm2 show improvements in 22 of 24 cases, 10 of which are the best result among the compared methods, with notable benefits from the imputation-based approach on the ETT datasets. The gains hold across architectures and datasets, although their size depends on the model and the dataset.

Abstract

Data augmentation in time series forecasting plays a crucial role in enhancing model performance by introducing variability while maintaining the underlying temporal patterns. However, time series data offers fewer augmentation strategies than fields such as image or text, and advanced techniques like Mixup are rarely used. In this work, we propose a novel approach, Imputation-Based Mixup Augmentation (IBMA), which combines imputation-augmented data with Mixup augmentation to bolster model generalization and improve forecasting performance. We evaluate the effectiveness of this method across several forecasting models, including DLinear (MLP), TimesNet (CNN), and iTransformer (Transformer), which represent some of the most recent advances in time series forecasting. Our experiments, conducted on four datasets (ETTh1, ETTh2, ETTm1, ETTm2) and compared against eight other augmentation techniques, demonstrate that IBMA consistently enhances performance, achieving improvements in 22 of 24 instances, 10 of which are the best performances, particularly with iTransformer imputation.
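
The idea of combining imputed data with Mixup can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`random_mask`, `impute_linear`, `mixup`), the mask ratio, and the Beta parameter `alpha` are assumptions made for illustration, and linear interpolation stands in for the trained self-supervised imputation model described in the paper.

```python
import numpy as np

def random_mask(shape, mask_ratio, rng):
    """Binary mask: 1 keeps the observed value, 0 hides it (assumed convention)."""
    return (rng.random(shape) >= mask_ratio).astype(np.float64)

def impute_linear(x, mask):
    """Stand-in imputer: fill hidden points per channel by linear interpolation.

    The paper uses a learned reconstruction model here; interpolation is only
    a placeholder so that the sketch runs without training anything.
    """
    t = np.arange(x.shape[0])
    out = x.copy()
    for c in range(x.shape[1]):
        observed = mask[:, c] == 1
        if observed.any():  # a fully hidden channel is left unchanged
            filled = np.interp(t, t[observed], x[observed, c])
            out[:, c] = np.where(observed, x[:, c], filled)
    return out

def mixup(a, b, alpha, rng):
    """Standard Mixup: convex combination with lambda drawn from Beta(alpha, alpha)."""
    lam = rng.beta(alpha, alpha)
    return lam * a + (1.0 - lam) * b, lam

# Two toy windows of shape (time steps, channels).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 96)
x1 = np.stack([np.sin(t), np.cos(t)], axis=1)
x2 = np.stack([np.sin(2 * t), np.cos(2 * t)], axis=1)

# Mask and impute each window, then mix the two imputed windows.
imputed = [impute_linear(x, random_mask(x.shape, 0.25, rng)) for x in (x1, x2)]
mixed, lam = mixup(imputed[0], imputed[1], alpha=0.5, rng=rng)
```

In a forecasting setup, the same mixing coefficient would presumably also be applied to the corresponding target windows so that inputs and targets stay aligned; that detail is an assumption here.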

Paper Structure

This paper contains 9 sections, 9 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Key Milestones in Time Series Forecasting math12101504
  • Figure 2: Illustration of the proposed data augmentation framework, comprising two key phases: Self-Supervised Reconstruction (SSR) for learning intrinsic data patterns and Imputed-based Mixup Augmentation (IMA) for enhancing data diversity and model generalization.
  • Figure 3: Data masking strategy: applying a binary mask to generate masked inputs.
  • Figure 4: Mixup applied to two imputed samples.
  • Figure 5: Comparison of the number of improvement cases and the best-case performance among eight augmentation methods, IA, and IMA on the ETT dataset.
  • ...and 1 more figures