Dataset Condensation for Time Series Classification via Dual Domain Matching

Zhanyu Liu; Ke Hao; Guanjie Zheng; Yanwei Yu

TL;DR

This paper proposes CondTSC (Dataset Condensation for Time Series Classification via Dual Domain Matching), a framework for condensing time series classification datasets. It generates a condensed dataset that matches the surrogate objectives in both the time and frequency domains.

Abstract

Time series data has been shown to be crucial in various research fields. Managing large quantities of time series data is challenging for deep learning tasks, particularly when training a deep neural network. Recently, a technique named Dataset Condensation has emerged as a solution to this problem. This technique generates a smaller synthetic dataset that achieves performance comparable to the full real dataset on downstream tasks such as classification. However, previous methods are primarily designed for image and graph datasets, and directly adapting them to time series datasets leads to suboptimal performance because they cannot effectively leverage the rich information inherent in time series data, particularly in the frequency domain. In this paper, we propose a novel framework named Dataset Condensation for Time Series Classification via Dual Domain Matching (CondTSC), which focuses on the time series classification dataset condensation task. Unlike previous methods, our framework aims to generate a condensed dataset that matches the surrogate objectives in both the time and frequency domains. Specifically, CondTSC incorporates multi-view data augmentation, dual domain training, and dual surrogate objectives to enhance the dataset condensation process in the time and frequency domains. Through extensive experiments, we demonstrate the effectiveness of our proposed framework, which outperforms other baselines and learns a condensed synthetic dataset that exhibits desirable characteristics such as conforming to the distribution of the original data.

Paper Structure (27 sections, 19 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Time Series Compression
Dataset Condensation
Frequency-enhanced Time Series Analysis
Preliminary
Problem Overview
Method
Initializing $\mathcal{S}$
Multi-view Data Augmentation
Dual Domain Training
Dual Domain Surrogate Objective Matching
Experiment
Datasets
Experiment Setting
...and 12 more sections

Figures (7)

Figure 1: The diagram illustrates the concept of time series data condensation, which aims to learn a small dataset that achieves comparable performance to the full dataset.
Figure 2: Diagram of CondTSC. LPF denotes low-pass filter, FTPP denotes Fourier transform phase perturbation, and FTMP denotes Fourier transform magnitude perturbation.
Figure 3: t-SNE visualization of the Insect dataset in both the time domain and the frequency domain. This motivates applying augmentation and using dual-domain information for time series data: the data in the frequency domain shows better decision boundaries, and the data augmentations squeeze the boundaries between different classes.
Figure 4: Accuracy (%) with larger condensed data sizes.
Figure 5: Frequency-domain and time-domain visualization, on the Insect dataset, of the learning process of synthetic data trained by CondTSC and MTT separately. The synthetic data trained by CondTSC conforms to the distribution of the real data and consequently achieves remarkable performance.
...and 2 more figures
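
The Figure 2 caption names three frequency-domain operations: a low-pass filter (LPF), Fourier transform phase perturbation (FTPP), and Fourier transform magnitude perturbation (FTMP). Their exact formulation is not given on this page, so the following is only a minimal NumPy sketch of what such augmentations might look like. The function names, the `keep_ratio` and `scale` parameters, and the choice of Gaussian noise are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def low_pass_filter(x, keep_ratio=0.5):
    """Keep only the lowest-frequency rFFT bins (assumed form of LPF).

    x has shape (..., length); keep_ratio is a hypothetical parameter
    giving the fraction of frequency bins that are retained.
    """
    spec = np.fft.rfft(x, axis=-1)
    cutoff = max(1, int(spec.shape[-1] * keep_ratio))
    spec[..., cutoff:] = 0  # zero out the high-frequency bins
    return np.fft.irfft(spec, n=x.shape[-1], axis=-1)


def phase_perturbation(x, scale=0.1, rng=None):
    """Add Gaussian noise to the phase spectrum (assumed form of FTPP)."""
    rng = np.random.default_rng() if rng is None else rng
    spec = np.fft.rfft(x, axis=-1)
    mag, phase = np.abs(spec), np.angle(spec)
    phase = phase + rng.normal(0.0, scale, size=phase.shape)
    return np.fft.irfft(mag * np.exp(1j * phase), n=x.shape[-1], axis=-1)


def magnitude_perturbation(x, scale=0.1, rng=None):
    """Rescale the magnitude spectrum with Gaussian noise (assumed form of FTMP)."""
    rng = np.random.default_rng() if rng is None else rng
    spec = np.fft.rfft(x, axis=-1)
    mag, phase = np.abs(spec), np.angle(spec)
    # Multiplicative noise, clipped so magnitudes stay non-negative.
    mag = np.clip(mag * (1.0 + rng.normal(0.0, scale, size=mag.shape)), 0.0, None)
    return np.fft.irfft(mag * np.exp(1j * phase), n=x.shape[-1], axis=-1)
```

Each function accepts a batch of series with shape `(batch, length)` and returns an array of the same shape, so the three outputs could be stacked with the original series to form multiple views. How CondTSC actually combines the views is described in the paper's Method section, not here.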
