CAAP: Class-Dependent Automatic Data Augmentation Based On Adaptive Policies For Time Series

Tien-Yu Chang, Hao Dai, Vincent S. Tseng

TL;DR

The paper tackles class-dependent bias in Automatic Data Augmentation for time-series, with a focus on ECG data. It introduces CAAP, a framework that combines a Class Adaption Policy Network, Class-dependent Regulation, and Information Region Adaption to learn class-aware augmentation policies while preserving informative waveform regions. A new metric for measuring class-dependent bias is proposed, and CAAP demonstrates improved accuracy and macro recall across multiple ECG datasets (PTBXL, Chapman, CPSC) and other time-series domains (HAR, EEG) compared with strong baselines. The work offers a practical, reliable ADA approach for real-world time-series applications, addressing both fairness and performance concerns in data augmentation.

Abstract

Data augmentation is a common technique for improving the performance of deep learning models by expanding the training dataset. Automatic Data Augmentation (ADA) methods are becoming popular because they can generate policies for various datasets. However, existing ADA methods focus primarily on overall performance improvement and neglect class-dependent bias, which reduces performance on specific classes. This bias poses significant challenges when deploying models in real-world applications. Furthermore, ADA for time series remains underexplored, highlighting the need for advances in this field. In particular, applying ADA to vital signals such as the electrocardiogram (ECG) is a compelling example because of its potential in medical domains such as heart disease diagnostics. We propose a novel deep learning-based framework, Class-dependent Automatic Adaptive Policies (CAAP), to overcome the class-dependent bias problem while maintaining the overall improvement in time-series data augmentation. First, we use a policy network to generate effective sample-wise policies with balanced difficulty through class and feature information extraction. Second, we design an augmentation probability regulation method to minimize class-dependent bias. Third, we introduce the concept of information regions into the ADA framework to preserve essential regions of each sample. Through a series of experiments on real-world ECG datasets, we demonstrate that CAAP outperforms representative methods, achieving lower class-dependent bias together with superior overall performance. These results highlight the reliability of CAAP as a promising ADA method for time-series modeling that fits the demands of real-world applications.

Paper Structure

This paper contains 22 sections, 12 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: Class-dependent Bias in ECG Task. The left graph is the RAO/RAE ECG signal, the middle graph is the augmented RAO/RAE ECG signal, and the right graph is the normal ECG signal. The red, yellow, and green boxes mark the P-wave parts of the RAO/RAE, augmented, and normal ECG signals, respectively. We use the scaling transformation to augment the ECG signal, which multiplies each time step of the original signal by a factor drawn from a normal distribution (mean = 1; standard deviation = 0.3).
  • Figure 2: CAAP Framework Overview. (Stage 1) Searching Phase: search for the augmentation policy and other parameters; (Stage 2) Training Phase: use the policy to train the classifier network with the Class-dependent Regulation module.
  • Figure 3: Diagram of Information Region Adaption Module.
  • Figure 4: Performance changes at different NoAug percentages. The x-axis is the fixed no-augmentation percentage (%); the orange/black/green lines on the right y-axis are sample-wise improvement (↑)/bias (↓)/gain (↑), and the blue bars on the left y-axis are accuracy (↑).
  • Figure 5: Class-wise recall changes at different NoAug percentages. The x-axis is the class index (c0 to c18, with classes that have 0% class-wise recall removed) in the PTBXL form. The bars within each class index correspond to no-augmentation percentages from 0% to 100%, and the y-axis is the recall of each class.
  • ...and 1 more figure
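
The scaling transformation described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `scale_augment`, the `seed` parameter, and the toy signal are assumptions introduced here, and only the per-time-step multiplication by factors drawn from a normal distribution (mean 1, standard deviation 0.3) comes from the caption.

```python
import random
from typing import List, Optional


def scale_augment(signal: List[float], sigma: float = 0.3,
                  seed: Optional[int] = None) -> List[float]:
    """Multiply each time step by its own factor drawn from N(1, sigma^2).

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    return [x * rng.gauss(1.0, sigma) for x in signal]


# Toy stand-in for a short single-lead ECG segment (made-up values, not real data).
toy_signal = [0.0, 0.1, 0.25, 0.1, 0.0, -0.05, 1.2, -0.3, 0.0, 0.15]

# Same settings as the caption: mean = 1, standard deviation = 0.3.
augmented = scale_augment(toy_signal, sigma=0.3, seed=0)

for original, scaled in zip(toy_signal, augmented):
    print(f"{original:+.3f} -> {scaled:+.3f}")
```

Because every sample is rescaled independently, low-amplitude features are perturbed in proportion to their own size. The caption highlights the P-wave regions, so a plausible reading is that this kind of distortion in a diagnostically relevant region can make an augmented sample of one class resemble another class.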