Domain-Oriented Time Series Inference Agents for Reasoning and Automated Analysis

Wen Ye, Wei Yang, Defu Cao, Yizhou Zhang, Lumingyuan Tang, Jie Cai, Yan Liu

TL;DR

The paper addresses a gap in real-world time series analysis, where multi-step reasoning, constraint handling, and domain knowledge are essential. It introduces TS-Reasoner, a domain-oriented time series agent that decomposes natural-language instructions into structured operator workflows and executes them with domain-specific tools, aided by a self-refinement feedback loop. In a dual evaluation (basic time series understanding on TimeSeriesExam, and complex multi-step inference on a newly constructed benchmark), the approach outperforms general-purpose LLMs, with higher success rates and lower error metrics. The work highlights the importance of combining reasoning with grounded computation and domain specialization to enable robust, interpretable time series analysis in applications such as energy, healthcare, and finance.

Abstract

Real-world time series inference requires more than point forecasting. It demands multi-step reasoning, constraint handling, domain knowledge incorporation, and domain-specific workflow assembly. Existing time series foundation models are limited to narrow tasks and lack flexibility to generalize across diverse scenarios. On the other hand, large language models (LLMs) struggle with numerical precision. To address these limitations, we introduce TS-Reasoner, a Domain-Oriented Time Series Agent that integrates natural language reasoning with precise numerical execution. TS-Reasoner decomposes natural language instructions into structured workflows composed of statistical, logical, and domain-specific operators, and incorporates a self-refinement mechanism for adaptive execution. We evaluate its capabilities through two axes: basic time series understanding and complex multi-step inference, using the TimeSeriesExam benchmark and a newly constructed dataset. Experimental results show that TS-Reasoner significantly outperforms general-purpose LLMs, highlighting the promise of domain-specialized agents for robust and interpretable time series reasoning.
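
The decompose, execute, and refine cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the operator names (moving_average, naive_forecast, enforce_max), the stub_decomposer function standing in for the LLM, and the error-message feedback format are all hypothetical, chosen only to show how an executor failure could be fed back to a decomposer that then proposes a corrected operator sequence.

```python
from typing import Callable, Dict, List, Optional, Tuple

Series = List[float]
Plan = List[Tuple[str, dict]]  # (operator name, keyword arguments)


def moving_average(x: Series, window: int) -> Series:
    """Trailing moving average; early points average whatever is available."""
    out = []
    for i in range(len(x)):
        chunk = x[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def naive_forecast(x: Series, horizon: int) -> Series:
    """Repeat the last observed value for `horizon` steps."""
    return [x[-1]] * horizon


def enforce_max(x: Series, limit: float) -> Series:
    """Constraint operator: clip every value to an upper bound."""
    return [min(v, limit) for v in x]


# Hypothetical operator library (statistical, forecasting, constraint).
OPERATORS: Dict[str, Callable[..., Series]] = {
    "moving_average": moving_average,
    "naive_forecast": naive_forecast,
    "enforce_max": enforce_max,
}


def stub_decomposer(instruction: str, feedback: Optional[str]) -> Plan:
    """Stand-in for the LLM task decomposer (assumption: no real model call).

    The first attempt deliberately names a non-existent operator so that the
    feedback path is exercised; the second attempt returns a valid plan.
    """
    first_step = "smooth" if feedback is None else "moving_average"
    return [
        (first_step, {"window": 2}),
        ("naive_forecast", {"horizon": 3}),
        ("enforce_max", {"limit": 10.0}),
    ]


def execute(plan: Plan, series: Series) -> Series:
    """Program executor: apply each operator in order, failing loudly."""
    for name, kwargs in plan:
        if name not in OPERATORS:
            raise KeyError(f"unknown operator: {name}")
        series = OPERATORS[name](series, **kwargs)
    return series


def run_agent(instruction: str, series: Series, max_rounds: int = 3):
    """Feedback loop: re-decompose with the executor's error until it runs."""
    feedback: Optional[str] = None
    for round_idx in range(1, max_rounds + 1):
        plan = stub_decomposer(instruction, feedback)
        try:
            return execute(plan, series), round_idx
        except Exception as err:
            feedback = str(err)  # passed back to the decomposer
    raise RuntimeError("no executable plan found")


result, rounds = run_agent("Forecast 3 steps, capped at 10", [8.0, 10.0, 14.0])
print(result, rounds)
```

In this toy run the first plan fails on the unknown operator, the error text is returned to the decomposer, and the second plan executes. Keeping the numerical work inside plain functions, rather than asking the language model to produce numbers directly, reflects the abstract's point that LLMs struggle with numerical precision.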

Paper Structure (41 sections, 5 equations, 14 figures, 5 tables)

Figures (14)

  • Figure 1: A time series of daily search frequency for the keyword "reasoning".
  • Figure 2: The pipeline of TS-Reasoner. The LLM works as a task decomposer, which learns from operator definitions and in-context examples to decompose task instances into sequences of operators. A program executor then executes the solution plan and forms a feedback loop with the task decomposer.
  • Figure 3: Strict accuracy of TS-Reasoner and general-purpose LLMs on TimeSeriesExam.
  • Figure 4: Performance on multi-step diagnostic tasks. A small jitter of 0.01 is added to the success rate to distinguish overlapping points. TSR w/ PQ denotes TS-Reasoner prompted with paraphrased questions. The LLM baselines are equipped with CodeAct.
  • Figure 5: Error distribution of different approaches on electricity prediction task without covariates.
  • ...and 9 more figures