MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis

Nimeesha Chan, Felix Parker, William Bennett, Tianyi Wu, Mung Yao Jia, James Fackler, Kimia Ghobadi

TL;DR

MedTsLLM is a framework that integrates high-frequency medical time series with unstructured text via a patch reprogramming layer, supporting three tasks: semantic segmentation, boundary detection, and anomaly detection. It encodes time-series patches with full-precision tokens, aligns them to the embedding space of a frozen pretrained LLM, and incorporates covariates through multiple strategies; the resulting model outperforms state-of-the-art baselines on ECG and ventilator waveform datasets. Key contributions include four covariate integration methods, patient-specific prompting, and task solvers that map LLM outputs directly to time-series predictions. The approach generalizes across medical domains and shows promise for clinical decision support, real-time monitoring, and fuller use of multimodal patient data. The authors acknowledge its computational demands and room for improved interpretability.

Abstract

The complexity and heterogeneity of data in many real-world applications pose significant challenges for traditional machine learning and signal processing techniques. For instance, in medicine, effective analysis of diverse physiological signals is crucial for patient monitoring and clinical decision-making and yet highly challenging. We introduce MedTsLLM, a general multimodal large language model (LLM) framework that effectively integrates time series data and rich contextual information in the form of text to analyze physiological signals, performing three tasks with clinical relevance: semantic segmentation, boundary detection, and anomaly detection in time series. These critical tasks enable deeper analysis of physiological signals and can provide actionable insights for clinicians. We utilize a reprogramming layer to align embeddings of time series patches with a pretrained LLM's embedding space and make effective use of raw time series, in conjunction with textual context. Given the multivariate nature of medical datasets, we develop methods to handle multiple covariates. We additionally tailor the text prompt to include patient-specific information. Our model outperforms state-of-the-art baselines, including deep learning models, other LLMs, and clinical methods across multiple medical domains, specifically electrocardiograms and respiratory waveforms. MedTsLLM presents a promising step towards harnessing the power of LLMs for medical time series analysis that can elevate data-driven tools for clinicians and improve patient outcomes.

Paper Structure (38 sections, 2 figures, 15 tables)

This paper contains 38 sections, 2 figures, 15 tables.

Figures (2)

  • Figure 1: In our proposed framework, the multimodal input consists of contextual input and raw time series data, which are both converted to embeddings. The concatenated embeddings are fed into a pretrained LLM. Output LLM embeddings are then used by task-specific methods to generate predictions. Our contributions are highlighted with red dotted lines.
  • Figure 2: A schematic overview of our proposed methodology. Time series and the corresponding textual context are embedded separately and then concatenated. Text passes through standard tokenization and embedding, while the time series is split into patches and transformed into embeddings, which are aligned to the text embeddings by a patch reprogrammer layer. Covariate information is merged using one of our proposed strategies; the embeddings are then fed into the LLM and through a projection layer to produce raw predictions. These predictions are transformed by task-specific layers and processing to solve one of our selected analysis tasks. The covariate strategies (green) and task solvers (pink) are our primary methodological contributions. All plot examples in the task solvers section are real outputs of our model on the datasets used in this work.
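
The patching and alignment steps described in the Figure 2 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that the reprogramming step is a single-head cross-attention in which patches act as queries over a small set of text-prototype embeddings, which is one common way to align patches to an LLM embedding space. The function names (`make_patches`, `reprogram`), the dimensions, and the random weights are all hypothetical stand-ins; in a real model the weights would be learned and the prototypes derived from the LLM's vocabulary embeddings.

```python
import numpy as np


def make_patches(series: np.ndarray, patch_len: int, stride: int) -> np.ndarray:
    """Split a 1-D series into overlapping patches, shape (n_patches, patch_len)."""
    if series.ndim != 1:
        raise ValueError("series must be 1-D")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")
    n_patches = (len(series) - patch_len) // stride + 1
    # Row i holds the indices of patch i: stride * i + [0, 1, ..., patch_len - 1].
    idx = np.arange(patch_len)[None, :] + stride * np.arange(n_patches)[:, None]
    return series[idx]


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def reprogram(patches, prototypes, w_q, w_k, w_v):
    """Assumed reprogramming step: single-head cross-attention.

    Patches are the queries; text prototypes supply keys and values, so each
    output row is a weighted mix of (projected) prototype embeddings.
    """
    q = patches @ w_q        # (n_patches, d_k)
    k = prototypes @ w_k     # (n_protos, d_k)
    v = prototypes @ w_v     # (n_protos, d_llm)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v, attn


# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
patch_len, stride, d_k, d_llm, n_protos = 16, 8, 32, 64, 10

# Synthetic stand-in for one channel of an ECG or ventilator waveform.
signal = np.sin(np.linspace(0, 8 * np.pi, 128))
patches = make_patches(signal, patch_len, stride)

# Random stand-ins for learned weights and text-prototype embeddings.
prototypes = rng.normal(size=(n_protos, d_llm))
w_q = rng.normal(size=(patch_len, d_k)) / np.sqrt(patch_len)
w_k = rng.normal(size=(d_llm, d_k)) / np.sqrt(d_llm)
w_v = rng.normal(size=(d_llm, d_llm)) / np.sqrt(d_llm)

patch_emb, attn = reprogram(patches, prototypes, w_q, w_k, w_v)

# Stand-in for the embedded prompt tokens; concatenated ahead of the patch
# embeddings to form the sequence that would be passed to the frozen LLM.
text_emb = rng.normal(size=(5, d_llm))
llm_input = np.concatenate([text_emb, patch_emb], axis=0)
print(patches.shape, patch_emb.shape, llm_input.shape)
```

In this sketch the LLM itself, the covariate strategies, and the task solvers are omitted; only the path from raw signal to the concatenated input sequence is shown. Because each reprogrammed patch is a convex combination of projected prototypes, its embedding stays in the span of vectors the LLM already represents, which is the intuition behind aligning patches to the text embedding space.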