TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis

Haokun Zhao, Xiang Zhang, Jiaqi Wei, Yiwei Xu, Yuting He, Siqi Sun, Chenyu You

TL;DR

TimeSeriesScientist (TSci) is a general-purpose, LLM-driven agentic framework for univariate time series forecasting that automates preprocessing, model selection, validation, and reporting. Four specialized agents (Curator, Planner, Forecaster, and Reporter) form a transparent, extensible workflow modeled on how a human data scientist works. Across eight benchmarks, TSci outperforms both statistical and LLM-based baselines, reducing forecast error by an average of 10.4% and 38.2%, respectively, and it generates rigorous, interpretable reports. The result is a domain-agnostic, white-box pipeline that couples automated reasoning with auditable documentation, which supports deployment in real-world settings.

Abstract

Time series forecasting is central to decision-making in domains as diverse as energy, finance, climate, and public health. In practice, forecasters face thousands of short, noisy series that vary in frequency, quality, and horizon, where the dominant cost lies not in model fitting, but in the labor-intensive preprocessing, validation, and ensembling required to obtain reliable predictions. Prevailing statistical and deep learning models are tailored to specific datasets or domains and generalize poorly. A general, domain-agnostic framework that minimizes human intervention is urgently needed. In this paper, we introduce TimeSeriesScientist (TSci), the first LLM-driven agentic framework for general time series forecasting. The framework comprises four specialized agents: Curator performs LLM-guided diagnostics augmented by external tools that reason over data statistics to choose targeted preprocessing; Planner narrows the hypothesis space of model choice by leveraging multi-modal diagnostics and self-planning over the input; Forecaster performs model fitting and validation and, based on the results, adaptively selects the best model configuration as well as ensemble strategy to make final predictions; and Reporter synthesizes the whole process into a comprehensive, transparent report. With transparent natural-language rationales and comprehensive reports, TSci transforms the forecasting workflow into a white-box system that is both interpretable and extensible across tasks. Empirical results on eight established benchmarks demonstrate that TSci consistently outperforms both statistical and LLM-based baselines, reducing forecast error by an average of 10.4% and 38.2%, respectively. Moreover, TSci produces a clear and rigorous report that makes the forecasting workflow more transparent and interpretable.
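The four-stage flow described above can be illustrated with a small, purely hypothetical Python sketch. It is not the TSci implementation: the LLM reasoning is replaced by fixed rules, and every name here (`curator`, `planner`, `forecaster`, `reporter`, `run_pipeline`) and the three toy models are assumptions chosen only to show how the stages hand data to one another.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional, Tuple

Series = List[Optional[float]]
Model = Callable[[List[float], int], List[float]]


def curator(raw: Series) -> List[float]:
    """Toy preprocessing: forward-fill gaps (stand-in for LLM-guided cleaning)."""
    observed = [x for x in raw if x is not None]
    last = observed[0]  # leading gaps are back-filled with the first observation
    cleaned: List[float] = []
    for x in raw:
        last = x if x is not None else last
        cleaned.append(last)
    return cleaned


def naive(history: List[float], horizon: int) -> List[float]:
    return [history[-1]] * horizon


def mean_model(history: List[float], horizon: int) -> List[float]:
    return [mean(history)] * horizon


def drift(history: List[float], horizon: int) -> List[float]:
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def planner(cleaned: List[float]) -> Dict[str, Model]:
    """Narrow the model space; a fixed shortlist stands in for LLM planning."""
    return {"naive": naive, "mean": mean_model, "drift": drift}


def mae(actual: List[float], predicted: List[float]) -> float:
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def forecaster(cleaned: List[float], candidates: Dict[str, Model],
               horizon: int, top_k: int = 2
               ) -> Tuple[List[float], Dict[str, float], List[str]]:
    """Validate each candidate on a holdout, then average the top_k models."""
    train, valid = cleaned[:-horizon], cleaned[-horizon:]
    scores = {name: mae(valid, m(train, horizon)) for name, m in candidates.items()}
    chosen = sorted(scores, key=scores.get)[:top_k]
    members = [candidates[name](cleaned, horizon) for name in chosen]
    ensemble = [mean(step) for step in zip(*members)]
    return ensemble, scores, chosen


def reporter(scores: Dict[str, float], chosen: List[str],
             forecast: List[float]) -> str:
    """Summarize validation results and the decision that produced the forecast."""
    lines = ["Validation MAE per model:"]
    lines += [f"  {name}: {score:.3f}" for name, score in sorted(scores.items())]
    lines.append(f"Ensemble members: {', '.join(chosen)}")
    lines.append(f"Forecast: {[round(v, 3) for v in forecast]}")
    return "\n".join(lines)


def run_pipeline(raw: Series, horizon: int) -> Tuple[List[float], str]:
    cleaned = curator(raw)
    candidates = planner(cleaned)
    forecast, scores, chosen = forecaster(cleaned, candidates, horizon)
    return forecast, reporter(scores, chosen, forecast)
```

Calling `run_pipeline([1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 8.0], 2)` returns a two-step forecast together with a plain-text report listing each candidate's validation MAE and the models chosen for the ensemble. The design point is that every stage returns an inspectable artifact, which is what makes the workflow auditable.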

Paper Structure

This paper contains 47 sections, 23 equations, 14 figures, 6 tables, 1 algorithm.

Figures (14)

  • Figure 1: Performance comparison of TSci with five LLM-based baselines. TSci outperforms LLM-based baselines on eight benchmarks spanning five domains (Figure \ref{mae_radar}). The comprehensive report generated by TSci outperforms LLM-based baselines across five rubrics (Figure \ref{winrate_radar}).
  • Figure 2: Overview of our proposed TSci framework. This collaborative multi-agent system is designed to analyze and forecast general time series data, just like a human scientist. Upon receiving input time series data, the framework executes a structured four-agent workflow. Curator generates analytical reports (Section \ref{curator}), Planner selects model configurations through reasoning and validation (Section \ref{planner}), Forecaster integrates model results to produce the final forecast (Section \ref{forecaster}), and Reporter generates a comprehensive report as the final output of our framework (Section \ref{report}).
  • Figure 3: Workflow of Curator. The raw dataset $\mathcal{D}$ is first diagnosed and processed into a cleaned dataset $\tilde{\mathcal{D}}$. Next, the agent generates tailored visualizations $V$ to expose temporal structures and facilitate interpretability. Finally, the agent integrates the processed data and visualizations to extract trends, seasonality, and stationarity, producing a comprehensive analysis summary $S$.
  • Figure 4: Demonstration of the output comprehensive report $\mathcal{R}$. The report consists of five parts, consolidating results, diagnostics, interpretations, and decision provenance into a transparent output.
  • Figure 5: Performance comparison of TSci with five LLM-based baselines across eight datasets.
  • ...and 9 more figures