An End-to-End Model for Time Series Classification In the Presence of Missing Values

Pengshuai Yao, Mengna Liu, Xu Cheng, Fan Shi, Huan Li, Xiufeng Liu, Shengyong Chen

TL;DR

This work tackles incomplete time series classification (ITSC) by proposing an end-to-end model that unifies data imputation and representation learning. It introduces a GRU-based temporal imputation module, a multi-scale dilated CNN feature learner, and a joint learning objective that optimizes both imputation and classification losses, treating the imputation output as potentially noisy yet still informative for discrimination. Across 68 UCR univariate datasets, a PAM multivariate dataset, and four real-world datasets with missing values, the approach consistently outperforms state-of-the-art ITSC methods, with the largest gains at high missing rates. The findings suggest that prioritizing classification performance, while enabling label-guided imputation and robust multi-scale feature extraction, yields practical, scalable improvements for real-world ITSC tasks.
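
As a rough illustration of what "multi-scale dilated" feature extraction means, the sketch below applies one fixed kernel to a series at several dilation rates. The function names, the difference kernel, and the rates (1, 2, 4) are assumptions chosen for illustration; the paper's module uses learned filters and its exact configuration is not given here.

```python
def dilated_conv1d(series, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` steps between kernel taps."""
    span = (len(kernel) - 1) * dilation  # distance covered by the kernel
    return [
        sum(w * series[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(series) - span)
    ]


def multi_scale_features(series, kernel, dilations=(1, 2, 4)):
    """Hypothetical helper: one feature map per dilation rate."""
    return {d: dilated_conv1d(series, kernel, d) for d in dilations}


series = [1, 2, 3, 4, 5, 6, 7, 8, 9]
features = multi_scale_features(series, kernel=[1, 0, -1])
print(features[1])  # [-2, -2, -2, -2, -2, -2, -2]
print(features[4])  # [-8]
```

Larger dilation rates compare points that are farther apart, so each scale responds to structure at a different temporal resolution without adding kernel weights.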

Abstract

Time series classification with missing data is a prevalent problem in time series analysis, as temporal data often contain missing values in practical applications. The traditional two-stage approach, which handles imputation and classification separately, can result in sub-optimal performance because label information is not used in the imputation process. A one-stage approach, on the other hand, can learn features under missing information, but its feature representation is limited because imputation errors are propagated into the classification process. To overcome these challenges, this study proposes an end-to-end neural network that unifies data imputation and representation learning within a single framework, allowing the imputation process to take advantage of label information. Unlike previous methods, our approach places less emphasis on the accuracy of the imputed data and instead prioritizes classification performance. A specifically designed multi-scale feature learning module extracts useful information from the noisy imputed data. The proposed model is evaluated on 68 univariate time series datasets from the UCR archive, a multivariate time series dataset with various missing data ratios, and 4 real-world datasets with missing information. The results indicate that the proposed model outperforms state-of-the-art approaches for incomplete time series classification, particularly in scenarios with high levels of missing data.
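
One way to picture a joint objective of this kind is as a classification loss plus a weighted imputation loss computed only on observed values. The sketch below assumes a masked mean squared error, a softmax cross-entropy, and a weight `imputation_weight`; all three are illustrative choices, not the paper's stated formulation.

```python
import math


def masked_mse(observed, estimated, mask):
    """Mean squared error over observed positions only (mask == 1)."""
    total = sum(m * (x - e) ** 2 for x, e, m in zip(observed, estimated, mask))
    count = sum(mask)
    return total / count if count else 0.0


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def joint_loss(observed, estimated, mask, logits, label, imputation_weight=1.0):
    """Assumed form: classification loss + weight * imputation loss."""
    return cross_entropy(logits, label) + imputation_weight * masked_mse(
        observed, estimated, mask
    )


# Missing positions hold a placeholder (0.0) and are excluded by the mask.
loss = joint_loss(
    observed=[1.0, 0.0, 3.0],
    estimated=[1.5, 9.0, 3.0],
    mask=[1, 0, 1],
    logits=[0.0, 0.0],
    label=0,
    imputation_weight=0.5,
)
print(loss)
```

Lowering `imputation_weight` in this sketch shifts the emphasis toward classification, which matches the abstract's stated priority, though the actual balance used by the authors is not specified in this section.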

Paper Structure (19 sections, 13 equations, 9 figures, 4 tables, 1 algorithm)

Figures (9)

  • Figure 1: (a) The two-stage method and (b) the one-stage method, which uses features from the imputation network for classification. Our proposed method in (c) treats the output of the imputation model as noisy and therefore unifies data imputation and feature learning within the same framework.
  • Figure 2: Imputation results of BRITS at different missing ratios.
  • Figure 3: Our ITSC network framework, illustrated with a univariate sample. The input is an incomplete time series (ITS), which passes through a temporal imputation module that imputes missing data using a GRU structure combined with the observed values. For the noisy input after imputation, we use a multi-scale feature learning module, which includes N layers of multi-scale 1D CNNs. Finally, classification is performed and the model is optimized through joint learning.
  • Figure 4: Illustration of temporal imputation when the first few steps contain missing values. It is evident that the estimated values $\tilde{x}_{2}$, $\tilde{x}_{3}$, and $\tilde{x}_{4}$ are incorrect. Furthermore, the learned hidden values ($\tilde{h}_{1}$ to $\tilde{h}_{3}$) can propagate these errors into the estimation of $\tilde{x}_{5}$.
  • Figure 5: The classification loss convergence during training.
  • ...and 4 more figures
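
The behavior described for Figures 3 and 4 (observed values are kept, gaps are filled from a recurrent state, and poor early estimates leak into later ones) can be sketched with a toy recurrence. The exponential moving average below is a deliberately simple stand-in for a GRU hidden state, not the paper's imputation module; `impute_recurrently`, `initial_state`, and `smoothing` are hypothetical names.

```python
def impute_recurrently(values, mask, initial_state=0.0, smoothing=0.5):
    """Fill missing steps with a running estimate built from earlier steps.

    The moving average is a toy substitute for a GRU hidden state.
    """
    state = initial_state
    completed = []
    for x, m in zip(values, mask):
        estimate = state                    # prediction from past steps only
        filled = x if m == 1 else estimate  # keep observations, fill gaps
        completed.append(filled)
        # Update the state from the filled value, so bad fills carry forward.
        state = smoothing * state + (1 - smoothing) * filled
    return completed


# The first two steps are missing, so they are filled from the uninformative
# initial state; the last estimate is then pulled toward those poor fills.
values = [None, None, 4.0, None]
mask = [0, 0, 1, 0]
print(impute_recurrently(values, mask))  # [0.0, 0.0, 4.0, 2.0]
```

In this toy run the final estimate is 2.0 rather than something close to the observed 4.0, which mirrors the error propagation that the Figure 4 caption describes.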