Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series

Ching Chang, Jeehyun Hwang, Yidan Shi, Haixin Wang, Wen-Chih Peng, Tien-Fu Chen, Wei Wang

TL;DR

Time-IMM introduces a cause-driven irregularity taxonomy for multimodal multivariate time series and provides a nine-dataset benchmark that preserves asynchronous textual information. The IMM-TSF library enables plug-and-play forecasting with modular text encoders and fusion mechanisms, including timestamp-to-text fusion and multimodal fusion. Empirical results show consistent forecasting improvements when leveraging textual context, with multimodal models delivering notable gains across irregularity types and robust temporal generalization under distributional shifts. The work offers a practical, extensible platform for evaluating and advancing time series analysis under real-world irregular and multimodal conditions.

Abstract

Time series data in real-world applications such as healthcare, climate modeling, and finance are often irregular, multimodal, and messy, with varying sampling rates, asynchronous modalities, and pervasive missingness. However, existing benchmarks typically assume clean, regularly sampled, unimodal data, creating a significant gap between research and real-world deployment. We introduce Time-IMM, a dataset specifically designed to capture cause-driven irregularity in multimodal multivariate time series. Time-IMM represents nine distinct types of time series irregularity, categorized into trigger-based, constraint-based, and artifact-based mechanisms. Complementing the dataset, we introduce IMM-TSF, a benchmark library for forecasting on irregular multimodal time series, enabling asynchronous integration and realistic evaluation. IMM-TSF includes specialized fusion modules, including a timestamp-to-text fusion module and a multimodality fusion module, which support both recency-aware averaging and attention-based integration strategies. Empirical results demonstrate that explicitly modeling multimodality on irregular time series data leads to substantial gains in forecasting performance. Time-IMM and IMM-TSF provide a foundation for advancing time series analysis under real-world conditions. The dataset is publicly available at https://github.com/blacksnail789521/Time-IMM, and the benchmark library can be accessed at https://github.com/blacksnail789521/IMM-TSF. Project page: https://blacksnail789521.github.io/time-imm-project-page/
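The abstract names recency-aware averaging as one strategy for timestamp-to-text fusion but does not spell out the weighting here. The sketch below is one plausible reading, not the IMM-TSF implementation: text embeddings observed at or before a query timestamp are averaged with weights that decay exponentially with their age. The function name `recency_weighted_average`, the exponential kernel, and the decay parameter `tau` are all assumptions introduced for illustration.

```python
import math
from typing import List, Sequence


def recency_weighted_average(
    query_time: float,
    text_times: Sequence[float],
    text_embeddings: Sequence[Sequence[float]],
    tau: float = 1.0,
) -> List[float]:
    """Fuse asynchronous text embeddings into one vector for `query_time`.

    Hypothetical helper: weights follow exp(-(query_time - t_i) / tau),
    an assumed kernel; the source only says "recency-aware averaging".
    """
    if len(text_times) != len(text_embeddings):
        raise ValueError("text_times and text_embeddings must have equal length")
    if tau <= 0:
        raise ValueError("tau must be positive")

    # Use only texts observed at or before the query time (no future leakage).
    past = [(t, e) for t, e in zip(text_times, text_embeddings) if t <= query_time]
    if not past:
        # No textual context yet: fall back to a zero vector of matching size.
        dim = len(text_embeddings[0]) if text_embeddings else 0
        return [0.0] * dim

    # Older texts get exponentially smaller weights.
    weights = [math.exp(-(query_time - t) / tau) for t, _ in past]
    total = sum(weights)

    dim = len(past[0][1])
    fused = [0.0] * dim
    for w, (_, emb) in zip(weights, past):
        for k in range(dim):
            fused[k] += (w / total) * emb[k]
    return fused
```

Calling `recency_weighted_average(4.0, [0.0, 4.0], [[0.0], [1.0]])` returns a value closer to the newer embedding than to the older one; a smaller `tau` sharpens that preference. An attention-based variant would replace the fixed kernel with learned scores.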

Paper Structure

This paper contains 130 sections, 12 equations, 10 figures, and 16 tables.

Figures (10)

  • Figure 1: Overview of the taxonomy of irregularities in time series data. Each type is exemplified by a corresponding dataset curated as part of the Time-IMM dataset.
  • Figure 2: Visualization of the Time-IMM datasets, annotated to show sampling patterns linked to the taxonomy, such as adaptive bursts (RepoHealth), trading-hour gaps (FNSPID), and sensor misalignment (EPA-Air).
  • Figure 3: Problem formulation for irregular multimodal multivariate time series forecasting.
  • Figure 4: IMM-TSF architecture. The library includes modular fusion layers that combine irregular numerical sequences with asynchronous text via timestamp-to-text fusion and multimodal forecast fusion.
  • Figure 5: Radar chart comparing forecasting performance across all baseline models. Shaded regions highlight the relative improvement from multimodal over unimodal variants.
  • ...and 5 more figures