Integrating Sequence and Image Modeling in Irregular Medical Time Series Through Self-Supervised Learning

Liuqing Chen, Shuhong Xiao, Shixian Ding, Shanhai Hu, Lingyun Sun

TL;DR

This work tackles irregular, heavily missing medical time series with a joint framework that fuses sequence-based imputation with image-based time-series representations. It combines a generator-discriminator module for sequence imputation, a pre-trained Swin Transformer for image features, and a joint projection that maps both into a unified representation. Three self-supervised losses (inter-sequence contrast, sequence-image contrast with a margin, and clustering-based alignment) drive the cross-modal integration. On three real-world datasets and in robustness tests, the method consistently outperforms seven state-of-the-art baselines on accuracy, precision, recall, F1, AUROC, and AUPRC, and expert feedback indicates that its imputations are clinically plausible.
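To make the idea of a sequence-image contrast with a margin more concrete, here is a minimal NumPy sketch of a generic hinge-style cross-modal loss. It is an illustration under assumptions, not the paper's formulation: the function names, the use of cosine similarity, the default `margin=0.2`, and the averaging over all mismatched pairs are hypothetical choices made for this example.

```python
import numpy as np

def l2_normalize(z, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), eps)

def margin_cross_modal_loss(seq_emb, img_emb, margin=0.2):
    """Generic hinge loss between paired sequence and image embeddings.

    seq_emb, img_emb: arrays of shape (batch, dim); row i of each array is
    assumed to come from the same time series. Requires batch >= 2.
    """
    s = l2_normalize(seq_emb)
    v = l2_normalize(img_emb)
    sim = s @ v.T                    # (batch, batch) cosine similarities
    pos = np.diag(sim)[:, None]      # similarity of each matched pair
    # Penalize a mismatched pair whenever it comes within `margin`
    # of the matched pair's similarity.
    hinge = np.maximum(0.0, margin - pos + sim)
    np.fill_diagonal(hinge, 0.0)     # a pair is not its own negative
    n = sim.shape[0]
    return hinge.sum() / (n * (n - 1))
```

With perfectly aligned, mutually orthogonal embeddings the loss is zero. It grows as mismatched sequence-image pairs become as similar as matched ones, which is the behavior a margin term is meant to discourage.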

Abstract

Medical time series are often irregular and face significant missingness, posing challenges for data analysis and clinical decision-making. Existing methods typically adopt a single modeling perspective, either treating series data as sequences or transforming them into image representations for further classification. In this paper, we propose a joint learning framework that incorporates both sequence and image representations. We also design three self-supervised learning strategies to facilitate the fusion of sequence and image representations, capturing a more generalizable joint representation. The results indicate that our approach outperforms seven other state-of-the-art models in three representative real-world clinical datasets. We further validate our approach by simulating two major types of real-world missingness through leave-sensors-out and leave-samples-out techniques. The results demonstrate that our approach is more robust and significantly surpasses other baselines in terms of classification performance.
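The two missingness simulations named in the abstract can be sketched as masking operations on a single sample. The text here does not define either procedure, so this NumPy sketch rests on assumptions: leave-sensors-out is taken to hide every observation of a randomly chosen subset of sensors, and leave-samples-out is taken to hide a random fraction of the individual observed values. The function names, the `(time, sensors)` layout, and zero-filling of hidden values are hypothetical choices.

```python
import numpy as np

def leave_sensors_out(x, mask, ratio, rng):
    """Hide all observations of a random fraction of sensors (assumed reading).

    x, mask: arrays of shape (time, sensors); mask is 1 where a value is observed.
    """
    n_sensors = x.shape[1]
    n_drop = int(round(ratio * n_sensors))
    dropped = rng.choice(n_sensors, size=n_drop, replace=False)
    new_mask = mask.copy()
    new_mask[:, dropped] = 0
    # Zero-fill hidden entries; the mask records what is still observed.
    return np.where(new_mask.astype(bool), x, 0.0), new_mask

def leave_samples_out(x, mask, ratio, rng):
    """Hide a random fraction of the currently observed values (assumed reading)."""
    observed = np.flatnonzero(mask)            # flat indices of observed entries
    n_drop = int(round(ratio * observed.size))
    dropped = rng.choice(observed, size=n_drop, replace=False)
    new_mask = mask.copy()
    new_mask.flat[dropped] = 0
    return np.where(new_mask.astype(bool), x, 0.0), new_mask
```

A robustness sweep of the kind described would call one of these with `ratio` set to values such as 0.1 through 0.5 and a generator from `np.random.default_rng(seed)`, then re-evaluate the classifier on the masked data.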

Paper Structure

This paper contains 20 sections, 13 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: The framework of our approach.
  • Figure 2: Performance under increased missingness: (a) leave-sensors-out and (b) leave-samples-out on the PAM dataset. Tests are conducted with 10%-50% extra missing values.