EMIT- Event-Based Masked Auto Encoding for Irregular Time Series

Hrishikesh Patel, Ruihong Qiu, Adam Irwin, Shazia Sadiq, Sen Wang

TL;DR

EMIT is a novel pretraining framework that applies event-based masking to irregular time series. It performs masking-based reconstruction in the latent space, which preserves the natural variability and timing of measurements while enhancing the model's ability to process irregular intervals without losing essential information.

Abstract

Irregular time series, where data points are recorded at uneven intervals, are prevalent in healthcare settings, such as emergency wards where vital signs and laboratory results are captured at varying times. This variability, which reflects critical fluctuations in patient health, is essential for informed clinical decision-making. Existing self-supervised learning research on irregular time series often relies on generic pretext tasks like forecasting, which may not fully utilise the signal provided by irregular time series. There is a significant need for specialised pretext tasks designed for the characteristics of irregular time series to enhance model performance and robustness, especially in scenarios with limited data availability. This paper proposes a novel pretraining framework, EMIT, an event-based masking framework for irregular time series. EMIT focuses on masking-based reconstruction in the latent space, selecting masking points based on the rate of change in the data. This method preserves the natural variability and timing of measurements while enhancing the model's ability to process irregular intervals without losing essential information. Extensive experiments on the MIMIC-III and PhysioNet Challenge datasets demonstrate the superior performance of our event-based masking strategy. The code has been released at https://github.com/hrishi-ds/EMIT.

Paper Structure (29 sections, 17 equations, 7 figures, 4 tables, 2 algorithms)

Figures (7)

  • Figure 1: The plots display the measurements of heart rate, oxygen saturation, and platelet count for a patient in the ICU during the first 24 hours after admission. The data points are recorded at inconsistent intervals and the data collection is asynchronous among the three clinical variables.
  • Figure 2: An illustration of significant and insignificant events in the context of irregular time series. Points exhibiting a large rate of change are highlighted in green and are considered significant events. Conversely, points highlighted in orange have a relatively low rate of change and are considered insignificant events. Our model, EMIT, prioritizes masking and reconstruction of points associated with significant events, focusing on green regions that exhibit a large rate of change.
  • Figure 3: EMIT pretraining architecture. The initial input triplets are embedded and subsequently masked with the respective masking token. The masks are selected based on events identified by their rate of change, as described in the paper's two algorithms (labelled alg:ev-mask and alg:rate). The embeddings are then summed to produce the final triplet embedding, which is fed into the transformer encoder blocks. The transformer attempts to reconstruct the masked embeddings using the remaining unmasked embeddings. The generated embeddings are then used for loss calculation.
  • Figure 4: Mortality prediction performance of different baseline models on the MIMIC-III dataset when trained with different percentages of labelled data.
  • Figure 5: Mortality prediction performance of different baseline models on the PhysioNet Challenge 2012 dataset when trained with different percentages of labelled data.
  • ...and 2 more figures
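
The abstract and the Figure 2 caption describe choosing mask positions by rate of change, so that points with a large rate of change (significant events) are masked and reconstructed first. The paper's own algorithms are not reproduced on this page, so the following is only a minimal sketch of that idea for a single variable. The function names, the absolute-difference rate, the top-k selection, and the `mask_ratio` parameter are all illustrative assumptions, not the authors' implementation (see the released code for that).

```python
from typing import List, Sequence


def rate_of_change(times: Sequence[float], values: Sequence[float]) -> List[float]:
    """Absolute rate of change between consecutive observations of one variable.

    Assumption: the first observation has no predecessor, so its rate is 0.0.
    """
    rates = [0.0]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        # Assumption: duplicate or out-of-order timestamps count as "no change".
        rates.append(abs(values[i] - values[i - 1]) / dt if dt > 0 else 0.0)
    return rates


def select_event_mask(
    times: Sequence[float], values: Sequence[float], mask_ratio: float = 0.3
) -> List[bool]:
    """Return True for observations chosen for masking.

    Hypothetical rule: mask the `mask_ratio` fraction of points with the
    largest rate of change, i.e. the "significant events".
    """
    rates = rate_of_change(times, values)
    n_mask = int(round(mask_ratio * len(values)))
    # Sort indices by descending rate; equal rates keep their time order.
    ranked = sorted(range(len(values)), key=lambda i: -rates[i])
    chosen = set(ranked[:n_mask])
    return [i in chosen for i in range(len(values))]


# Made-up heart-rate readings at irregular times (hours since admission).
times = [0.0, 1.0, 1.5, 4.0, 4.5, 6.0]
heart_rate = [80.0, 82.0, 110.0, 108.0, 85.0, 84.0]

mask = select_event_mask(times, heart_rate, mask_ratio=0.3)
print(mask)  # [False, False, True, False, True, False]
```

In this toy series the jump to 110 and the drop to 85 have the steepest slopes, so they are the two points selected. A full pipeline would apply the resulting mask to the triplet embeddings before the transformer encoder, as outlined in the Figure 3 caption.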