Time-RA: Towards Time Series Reasoning for Anomaly Diagnosis with LLM Feedback
Yiyuan Yang, Zichuan Liu, Lei Song, Kai Ying, Zhiguang Wang, Tom Bamford, Svitlana Vyetrenko, Jiang Bian, Qingsong Wen
TL;DR
Time-RA reframes time series anomaly detection as a multimodal, reasoning-focused task aimed at interpretable root-cause analysis. It introduces RATs40K, a real-world dataset that combines numeric series, textual context, and visual plots, annotated with 14 univariate and 6 multivariate anomaly types and with structured reasoning that follows an Observation–Thought–Action paradigm. The task takes $T$, $D$, and $V$ as inputs and produces $y_l$, $a$, and $r$ as outputs. Benchmarking indicates that supervised fine-tuning and visual inputs improve diagnostic accuracy and reasoning consistency, and that fine-tuned models transfer to unseen real-world datasets. Overall, Time-RA offers a scalable framework for multimodal, interpretable time series analysis and establishes benchmarks for future research in anomaly diagnosis and reasoning.
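As a rough illustration of the task interface named above, the following minimal sketch models one sample and one answer as plain Python records. The mapping of symbols to fields is an assumption made for illustration only: $T$ as the numeric series, $D$ as the textual context, $V$ as the plot, $y_l$ as a binary label, $a$ as an anomaly type, and $r$ as reasoning split into Observation, Thought, and Action. All class names, field names, and the example anomaly type are hypothetical and are not taken from the RATs40K schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimeRAInput:
    # T (assumed): one inner list per time step, one value per channel
    series: List[List[float]]
    # D (assumed): free-text context describing the series
    context: str
    # V (assumed): path to a rendered plot; hypothetical field
    plot_path: Optional[str] = None

    def is_multivariate(self) -> bool:
        # More than one channel per time step means a multivariate series.
        return len(self.series) > 0 and len(self.series[0]) > 1


@dataclass
class TimeRAOutput:
    # y_l (assumed): binary anomaly label
    is_anomalous: bool
    # a (assumed): fine-grained anomaly type, None for normal samples
    anomaly_type: Optional[str]
    # r (assumed): reasoning, split into the three annotated steps
    observation: str
    thought: str
    action: str

    def is_consistent(self) -> bool:
        # An anomalous answer should name a type; a normal one should not.
        return self.is_anomalous == (self.anomaly_type is not None)


# Hypothetical univariate sample with one obvious spike.
sample = TimeRAInput(
    series=[[0.1], [0.2], [5.0], [0.2]],
    context="hourly sensor reading",
)
answer = TimeRAOutput(
    is_anomalous=True,
    anomaly_type="point spike",  # illustrative name, not a dataset category
    observation="the value jumps at the third step",
    thought="the jump is isolated and far from the local level",
    action="flag the third step for inspection",
)
```

Keeping the label, the type, and the reasoning as separate fields lets a simple check such as `is_consistent` catch answers whose parts contradict each other before any scoring is done.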
Abstract
Time series anomaly detection (TSAD) has traditionally focused on binary classification and often lacks the fine-grained categorization and explanatory reasoning required for transparent decision-making. To address these limitations, we propose Time-series Reasoning for Anomaly (Time-RA), a novel task that reformulates TSAD from a discriminative into a generative, reasoning-intensive paradigm. To facilitate this, we introduce RATs40K, the first real-world large-scale multimodal benchmark with ~40,000 samples across 10 domains, integrating raw time series, textual context, and visual plots with structured reasoning annotations. Extensive benchmarking shows that while supervised fine-tuning and visual representations boost diagnostic accuracy and reasoning consistency, performance varies across complex scenarios. Notably, fine-tuned models demonstrate strong "plug-and-play" transferability, outperforming traditional baselines on unseen real-world datasets. Our work establishes a foundation for interpretable, multimodal time series analysis. All code (https://github.com/yyysjz1997/Time-RA) and the RATs40K dataset (https://huggingface.co/datasets/Time-RA/RATs40K) are fully open-sourced to facilitate future research.
