TimeCopilot

Azul Garza, Renée Rosillo

TL;DR

TimeCopilot addresses fragmentation in time-series forecasting by unifying multiple Time Series Foundation Models (TSFMs) under an LLM-driven, open-source agentic interface. It introduces an end-to-end pipeline in which LLMs orchestrate feature analysis, model selection, cross-validation, and forecast generation, and provide natural-language explanations for decisions and forecasts. On the GIFT-Eval benchmark, a TimeCopilot MedianEnsemble of Chronos-2, TimesFM, and TiRex with isotonic regression achieves state-of-the-art probabilistic and point accuracy at a cost of about $24 in distributed GPU inference. The work lays a foundation for reproducible, explainable, and accessible agentic forecasting and outlines future work on Model Context Protocol integration, domain expansion, and hierarchical/multivariate forecasting.
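To make the median-ensemble idea concrete, the sketch below combines quantile forecasts from several models by taking the per-level median, then applies isotonic regression (via the pool-adjacent-violators algorithm) so that the combined quantiles are non-decreasing. This is an illustration under stated assumptions, not TimeCopilot's implementation: the summary does not say how isotonic regression is applied, and restoring monotone quantiles is only one common use. The function names, model labels, and numbers are all hypothetical.

```python
from statistics import median


def pava_non_decreasing(values):
    """Least-squares non-decreasing fit to `values` (pool adjacent violators)."""
    blocks = []  # each block holds [mean, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, n_b = blocks.pop()
            mean_a, n_a = blocks.pop()
            n = n_a + n_b
            blocks.append([(mean_a * n_a + mean_b * n_b) / n, n])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def median_ensemble(model_quantiles):
    """Combine quantile forecasts for a single time step.

    `model_quantiles` maps a model label to its forecasts at the same
    ascending quantile levels (here assumed to be 0.1, 0.5, 0.9).
    """
    per_level = zip(*model_quantiles.values())
    raw = [median(level) for level in per_level]
    # Individual models may emit crossing quantiles, so the per-level
    # median can cross too; the isotonic fit repairs that.
    return pava_non_decreasing(raw)


# Hypothetical forecasts; two of the three models have crossing quantiles.
forecasts = {
    "model_a": [8.0, 10.0, 12.0],
    "model_b": [11.0, 9.0, 13.0],
    "model_c": [12.0, 9.5, 11.0],
}
combined = median_ensemble(forecasts)
print(combined)  # [10.25, 10.25, 12.0]
```

The raw per-level medians are 11.0, 9.5, and 12.0; the first two cross, so the isotonic step pools them to their mean, 10.25.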

Abstract

We introduce TimeCopilot, the first open-source agentic framework for forecasting that combines multiple Time Series Foundation Models (TSFMs) with Large Language Models (LLMs) through a single unified API. TimeCopilot automates the forecasting pipeline: feature analysis, model selection, cross-validation, and forecast generation, while providing natural language explanations and supporting direct queries about the future. The framework is LLM-agnostic, compatible with both commercial and open-source models, and supports ensembles across diverse forecasting families. Results on the large-scale GIFT-Eval benchmark show that TimeCopilot achieves state-of-the-art probabilistic forecasting performance at low cost. Our framework provides a practical foundation for reproducible, explainable, and accessible agentic forecasting systems.
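The model-selection and cross-validation stages of such a pipeline can be sketched with rolling-origin evaluation: each candidate model is scored on several held-out windows at the end of the series, and the lowest-error model produces the final forecast. The code below is a minimal pure-Python sketch, not TimeCopilot's API. The three baseline forecasters, the function names, the MAE criterion, and the toy series are assumptions chosen for illustration, and the LLM-driven steps (feature analysis, explanations) are omitted.

```python
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive(history, horizon, season=4):
    """Repeat the last full season (assumes len(history) >= season)."""
    return [history[-season + (h % season)] for h in range(horizon)]


def historic_average(history, horizon):
    """Forecast the mean of the history."""
    return [sum(history) / len(history)] * horizon


def rolling_origin_mae(series, model, horizon, n_windows):
    """Mean absolute error over `n_windows` non-overlapping test windows."""
    errors = []
    for w in range(n_windows):
        cutoff = len(series) - horizon * (n_windows - w)
        train = series[:cutoff]
        test = series[cutoff:cutoff + horizon]
        forecast = model(train, horizon)
        errors.extend(abs(f - y) for f, y in zip(forecast, test))
    return sum(errors) / len(errors)


def select_and_forecast(series, models, horizon, n_windows=2):
    """Score every model by cross-validation, then forecast with the best."""
    scores = {
        name: rolling_origin_mae(series, model, horizon, n_windows)
        for name, model in models.items()
    }
    best = min(scores, key=scores.get)
    return best, scores, models[best](series, horizon)


# Toy quarterly-style series with a period of 4.
series = [10, 20, 30, 40] * 4
models = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "historic_average": historic_average,
}
best, scores, forecast = select_and_forecast(series, models, horizon=4)
print(best, forecast)  # seasonal_naive [10, 20, 30, 40]
```

On this perfectly periodic series the seasonal baseline has zero cross-validation error, so it is selected. An agentic system would additionally report the scores and explain the choice in natural language.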

Paper Structure

This paper contains 9 sections and 2 figures.

Figures (2)

  • Figure 1: (Right) TimeCopilot's overall architecture. (Left) TimeCopilot Agent API usage.
  • Figure 2: Performance of TimeCopilot and baseline models on the GIFT-Eval benchmark (Aksu et al., 2024). Lower values indicate better forecast performance. The top 16 models are shown for each metric (CRPS and MASE). The top row shows the mean ranks across datasets, and the bottom row shows the corresponding mean scores.