TANDEM: Temporal Attention-guided Neural Differential Equations for Missingness in Time Series Classification

YongKyung Oh, Dong-Young Lim, Sungil Kim, Alex Bui

Abstract

Handling missing data in time series classification remains a significant challenge across many domains. Traditional methods often rely on imputation, which may introduce bias or fail to capture the underlying temporal dynamics. In this paper, we propose TANDEM (Temporal Attention-guided Neural Differential Equations for Missingness), an attention-guided neural differential equation framework for classifying time series with missing values. Our approach integrates three feature streams (raw observations, an interpolated control path, and continuous latent dynamics) through a novel attention mechanism, allowing the model to focus on the most informative aspects of the data. We evaluate TANDEM on 30 benchmark datasets and a real-world medical dataset, demonstrating its superiority over existing state-of-the-art methods. Our framework not only improves classification accuracy but also provides insights into the handling of missing data, making it a valuable tool in practice.

Paper Structure

This paper contains 29 sections, 8 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: Conceptual overview of the TANDEM framework. For a given time series with potentially missing values, three distinct feature streams are processed: (i) the raw observation $\tilde{{\bm{x}}}(t)$, (ii) an interpolated, piecewise-smooth control path $\bm{X}(t)$, and (iii) continuous latent dynamics ${\bm{z}}(t)$ derived from an NDE backbone. Each stream is individually refined by attention mechanisms, resulting in attended representations $\Phi_{\tilde{{\bm{x}}}}(t)$, $\Phi_{\bm{X}}(t)$, and $\Phi_{{\bm{z}}}(t)$. Colors represent the learned temporal attention scores for each stream. These attended representations are then adaptively weighted by learnable Gumbel-Sigmoid gates ($\sigma_{\tilde{{\bm{x}}}}$, $\sigma_{\bm{X}}$, $\sigma_{{\bm{z}}}$) which determine the contribution of each stream. The resulting fused representation, $\bar{\mathcal{Z}}(t)$, is subsequently passed to a classifier.
  • Figure 2: Performance-computation trade-off analysis of NDE baselines and their TANDEM variants. Each point represents a model configuration with a different number of layers ($n_l$) and hidden size ($n_h$). Dashed and solid lines show second-order polynomial trends for each model family. Each configuration is repeated five times.
  • Figure 3: Ablation study of the model configuration.
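
The fusion step described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the three streams have already been projected to a shared shape $(T, d)$ with missing raw observations zero-filled, that temporal attention is a softmax over time of a linear score, and that the fused representation $\bar{\mathcal{Z}}(t)$ is a gate-weighted sum of the attended streams. The names `temporal_attention`, `gumbel_sigmoid`, `fuse`, `score_vecs`, and `gate_logits` are hypothetical and do not come from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def temporal_attention(stream, w):
    """Softmax-over-time attention (assumed form, not taken from the paper).

    stream: (T, d) feature stream; w: (d,) hypothetical scoring vector.
    Returns the attended stream (T, d) and the attention scores (T,).
    """
    logits = stream @ w
    logits = logits - logits.max()  # subtract max for numerical stability
    scores = np.exp(logits) / np.exp(logits).sum()
    return stream * scores[:, None], scores


def gumbel_sigmoid(logit, tau, rng):
    """Relaxed Bernoulli gate: sigmoid((logit + g1 - g2) / tau), g1, g2 ~ Gumbel(0, 1)."""
    g1, g2 = rng.gumbel(size=2)
    return sigmoid((logit + g1 - g2) / tau)


def fuse(streams, score_vecs, gate_logits, tau=1.0, seed=0):
    """Gate-weighted sum of attended streams (assumed fusion rule)."""
    rng = np.random.default_rng(seed)
    fused = np.zeros_like(streams[0])
    gates = []
    for stream, w, logit in zip(streams, score_vecs, gate_logits):
        attended, _ = temporal_attention(stream, w)
        gate = gumbel_sigmoid(logit, tau, rng)
        gates.append(gate)
        fused += gate * attended
    return fused, np.array(gates)


# Toy usage: three random streams standing in for the raw observation,
# the control path, and the latent dynamics.
data_rng = np.random.default_rng(1)
T, d = 5, 4
streams = [data_rng.normal(size=(T, d)) for _ in range(3)]
score_vecs = [data_rng.normal(size=d) for _ in range(3)]
fused, gates = fuse(streams, score_vecs, gate_logits=[0.0, 0.5, -0.5])
_, example_scores = temporal_attention(streams[0], score_vecs[0])
```

In this sketch a lower temperature `tau` pushes each gate toward 0 or 1, so a stream is either kept or suppressed more decisively. In a trained model the gate logits and scoring vectors would be learned parameters rather than fixed inputs.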