Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Ching Chang, Jeehyun Hwang, Yidan Shi, Haixin Wang, Wen-Chih Peng, Tien-Fu Chen, Wei Wang
TL;DR
Time-IMM introduces a cause-driven irregularity taxonomy for multimodal multivariate time series and provides a nine-dataset benchmark that preserves asynchronous textual information. The IMM-TSF library enables plug-and-play forecasting with modular text encoders and fusion mechanisms, including timestamp-to-text fusion and multimodal fusion. Empirical results show consistent forecasting improvements when leveraging textual context, with multimodal models delivering notable gains across irregularity types and robust temporal generalization under distributional shifts. The work offers a practical, extensible platform for evaluating and advancing time series analysis under real-world irregular and multimodal conditions.
Abstract
Time series data in real-world applications such as healthcare, climate modeling, and finance are often irregular, multimodal, and messy, with varying sampling rates, asynchronous modalities, and pervasive missingness. However, existing benchmarks typically assume clean, regularly sampled, unimodal data, creating a significant gap between research and real-world deployment. We introduce Time-IMM, a dataset specifically designed to capture cause-driven irregularity in multimodal multivariate time series. Time-IMM represents nine distinct types of time series irregularity, categorized into trigger-based, constraint-based, and artifact-based mechanisms. Complementing the dataset, we introduce IMM-TSF, a benchmark library for forecasting on irregular multimodal time series, enabling asynchronous integration and realistic evaluation. IMM-TSF provides specialized fusion modules, namely a timestamp-to-text fusion module and a multimodality fusion module, which support both recency-aware averaging and attention-based integration strategies. Empirical results demonstrate that explicitly modeling multimodality on irregular time series data leads to substantial gains in forecasting performance. Time-IMM and IMM-TSF provide a foundation for advancing time series analysis under real-world conditions. The dataset is publicly available at https://github.com/blacksnail789521/Time-IMM, and the benchmark library can be accessed at https://github.com/blacksnail789521/IMM-TSF. Project page: https://blacksnail789521.github.io/time-imm-project-page/
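To make the idea of recency-aware averaging concrete, the following is a minimal sketch of how asynchronous text embeddings might be aligned to a numeric timestamp. It is not the IMM-TSF implementation: the function name `recency_weighted_average`, the exponential-decay weighting, and the time-scale parameter `tau` are all assumptions chosen for illustration. The sketch only uses texts observed at or before the query time, so that no future information leaks into the forecast input.

```python
import math
from typing import List, Sequence


def recency_weighted_average(
    query_time: float,
    text_times: Sequence[float],
    text_embeddings: Sequence[Sequence[float]],
    tau: float = 1.0,
) -> List[float]:
    """Hypothetical helper: average past text embeddings with weights
    exp(-(query_time - t) / tau), so more recent texts count more.

    The exponential form and `tau` are illustrative assumptions,
    not taken from the IMM-TSF library.
    """
    if len(text_times) != len(text_embeddings):
        raise ValueError("text_times and text_embeddings must have equal length")
    if tau <= 0:
        raise ValueError("tau must be positive")

    # Keep only texts observed at or before the query timestamp.
    past = [(t, e) for t, e in zip(text_times, text_embeddings) if t <= query_time]
    if not past:
        # No text is available yet: fall back to a zero vector.
        dim = len(text_embeddings[0]) if text_embeddings else 0
        return [0.0] * dim

    # Unnormalized recency weights; the newest text gets the largest weight.
    weights = [math.exp(-(query_time - t) / tau) for t, _ in past]
    total = sum(weights)
    dim = len(past[0][1])

    # Normalized weighted sum, computed one embedding dimension at a time.
    return [
        sum(w * e[i] for w, (_, e) in zip(weights, past)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Three notes at times 0, 2, and 5; the query at time 2 ignores the note at 5.
    times = [0.0, 2.0, 5.0]
    embeddings = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
    print(recency_weighted_average(2.0, times, embeddings, tau=1.0))
```

An attention-based alternative would replace the fixed decay weights with weights learned from the data, at the cost of additional parameters.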
