Data Augmentation for Multivariate Time Series Classification: An Experimental Study

Romain Ilbert; Thai V. Hoang; Zonghua Zhang

TL;DR

The paper tackles data scarcity in multivariate time series classification by introducing a taxonomy of augmentation techniques and evaluating their impact on ROCKET and InceptionTime across 13 imbalanced UCR/UEA datasets. Augmentation improves accuracy on 10 of the 13 datasets, though no single technique universally dominates, which underscores the value of diverse, potentially pipeline-based augmentation strategies. The study provides a practical framework for applying augmentation to time series and highlights directions for future research, including synergy among methods and domain-adaptive pipelines. Overall, the work clarifies how augmentation can improve robustness and generalization in time series classification under limited data.

Abstract

Our study investigates the impact of data augmentation on the performance of multivariate time series models, focusing on datasets from the UCR archive. Despite the limited size of these datasets, we achieved classification accuracy improvements in 10 out of 13 datasets using the ROCKET and InceptionTime models. This highlights the essential role of sufficient data in training effective models, paralleling the advancements seen in computer vision. Our work adapts existing augmentation methods and applies them in new ways to multivariate time series classification. By analyzing and applying a variety of augmentation techniques, we demonstrate that strategic data enrichment can enhance model accuracy, and that diverse augmentation strategies are crucial for unlocking the potential of both traditional and deep learning models. The study establishes a benchmark for future research on data scarcity in time series analysis.

Paper Structure (23 sections, 4 equations, 6 figures, 6 tables)

Introduction
A Taxonomy of Time Series Augmentation Techniques
Overview of Time Series Augmentation Techniques
Basic Techniques
Time Domain
Frequency Domain
Oversampling Techniques
Decomposition-Based Techniques
Generative Techniques Overview
Statistical Generative Models
Neural Networks Based Generative Models
Probabilistic Models
Structure- and Label-Preserving Techniques
Label-preserving
Structure-preserving
...and 8 more sections

Figures (6)

Figure 1: Comprehensive taxonomy of data augmentation techniques for time series analysis, integrating a wide array of methodologies from basic transformations to advanced generative models, including a branch on Preserving Techniques.
Figure 2: Basic Techniques, like noise injection
Figure 3: Oversampling Techniques, like SMOTE
Figure 4: Generative Techniques, like TimeGAN
Figure 5: Label-Preserving Techniques, like range techniques
...and 1 more figure
