No Imputation Needed: A Switch Approach to Irregularly Sampled Time Series

Rohit Agarwal, Aman Sinha, Ayan Vishwakarma, Xavier Coubez, Marianne Clausel, Mathieu Constant, Alexander Horsch, Dilip K. Prasad

TL;DR

The paper tackles irregularly sampled time series (ISTS) without resorting to imputation. It introduces SLAN (Switch LSTM Aggregate Network), which assigns one LSTM per sensor and uses a switch layer to activate only the LSTMs of sensors that were actually observed. SLAN maintains a global summary state and per-sensor local summary states, and applies a time-aware decay via Time2Vec for each sensor. On the MIMIC-III and PhysioNet 2012 mortality tasks, SLAN consistently outperforms both imputation-based baselines (e.g., IP-Nets, GRU-D) and non-imputation models in AUPRC and AUROC. The work also demonstrates SLAN's robustness to increasing missingness, analyzes the importance of sensors beyond their sampling rate, and discusses practical scalability and potential extensions to multi-modality and streaming settings.

Abstract

Modeling irregularly sampled time series (ISTS) is challenging because of missing values. Most existing methods handle ISTS by converting irregularly sampled data into regularly sampled data via imputation. These models assume an underlying missingness mechanism, which may lead to unwanted bias and sub-optimal performance. We present SLAN (Switch LSTM Aggregate Network), which utilizes a group of LSTMs to model ISTS without imputation, eliminating the assumption of any underlying process. It dynamically adapts its architecture on the fly based on the measured sensors using switches. SLAN exploits the irregularity information to explicitly capture each sensor's local summary and maintains a global summary state throughout the observational period. We demonstrate the efficacy of SLAN on two public datasets, namely MIMIC-III and PhysioNet 2012.
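The switch idea described above can be illustrated with a small sketch. This is not the authors' implementation: `SensorCell` is a hypothetical scalar tanh recurrent unit standing in for a per-sensor LSTM, the averaging used for the global summary is an assumed aggregation rule, and the Time2Vec-based decay is omitted entirely. The sketch only shows the routing behaviour: at each time step, only the cells of observed sensors are updated, so no value is ever imputed for a missing sensor.

```python
import math


class SensorCell:
    """Hypothetical scalar tanh recurrent unit, standing in for one per-sensor LSTM."""

    def __init__(self, w_x, w_h, w_g):
        self.w_x = w_x      # weight on the new measurement
        self.w_h = w_h      # weight on this sensor's previous local summary
        self.w_g = w_g      # weight on the shared global summary
        self.h = 0.0        # local summary state for this sensor
        self.last_t = None  # time of the most recent measurement, if any

    def step(self, x, t, g):
        """Update the local summary from measurement x taken at time t."""
        self.h = math.tanh(self.w_x * x + self.w_h * self.h + self.w_g * g)
        self.last_t = t
        return self.h


def run_switch_network(events, cells):
    """Process time-ordered (t, {sensor_id: value}) events and return the global summary.

    Only sensors present in an event are "switched on"; all other cells
    keep their state untouched, so nothing is imputed.
    """
    g = 0.0  # global summary state
    for t, observed in events:
        updated = [cells[m].step(x, t, g) for m, x in observed.items()]
        if updated:
            # Assumed aggregation: mean of the freshly updated local summaries.
            g = sum(updated) / len(updated)
    return g


# Three sensors; sensor 1 is never measured in this toy record.
cells = {m: SensorCell(0.5, 0.5, 0.1) for m in range(3)}
events = [
    (0.0, {0: 1.0}),
    (1.5, {0: 0.5, 2: 2.0}),
    (4.0, {2: -1.0}),
]
summary = run_switch_network(events, cells)
```

In this toy run, sensor 1 is never observed, so its cell is never touched and its local summary stays at its initial value. A fuller version would replace `SensorCell` with a real LSTM cell and add the time-aware decay based on the gap since `last_t`.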

Paper Structure (52 sections, 11 equations, 7 figures, 5 tables, 1 algorithm)

Figures (7)

  • Figure 1: AUPRC of SLAN vs. IPNets on the P-12 and M-3 datasets after dropping 25%, 50%, and 75% of the observed data. The red arrows and red numbers show the percentage increase in AUPRC of SLAN over IPNets. The blue and orange numbers give the AUPRC of SLAN and IPNets, respectively.
  • Figure 2: (a) A snapshot of a multivariate regularly sampled time series for the $i^\text{th}$ instance. $m$ represents the index of the sensor. (b) A snapshot of a multivariate irregularly sampled time series (ISTS) for the $i^\text{th}$ instance. (c) Problem representation of the ISTS with respect to one instance, omitting the subscript $i$. (Best viewed in color)
  • Figure 3: (a) SLAN architecture. Here $x^m_j$ denotes the input of the $m^\text{th}$ sensor at time $t_j$. A closed circuit in the switch layer means that the switch is "on"; otherwise it is "off". A red X means that the corresponding LSTM block receives no input and produces no output. (b) The inner workings of an LSTM block. (c) Legend of notations.
  • Figure 4: SLAN on different percentages of the training datasets. The mean over 3 runs is reported with a 95% confidence interval.
  • Figure 5: Comparison of the ranking of clinical variables w.r.t. sampling rate and mean importance.
  • ...and 2 more figures