Retrieval Augmented Time Series Forecasting

Kutay Tire, Ege Onur Taga, Muhammed Emrullah Ildiz, Samet Oymak

TL;DR

This paper advocates that the dynamic and event-driven nature of time-series data makes RAG a crucial component of TSFMs and introduces a principled RAG framework for time-series forecasting, called Retrieval Augmented Forecasting (RAF).

Abstract

Retrieval-augmented generation (RAG) is a central component of modern LLM systems, particularly in scenarios where up-to-date information is crucial for accurately responding to user queries or when queries exceed the scope of the training data. The advent of time-series foundation models (TSFMs), such as Chronos, and the need for effective zero-shot forecasting performance across various time-series domains motivate the question: Do the benefits of RAG similarly carry over to time-series forecasting? In this paper, we advocate that the dynamic and event-driven nature of time-series data makes RAG a crucial component of TSFMs and introduce a principled RAG framework for time-series forecasting, called Retrieval Augmented Forecasting (RAF). Within RAF, we develop efficient strategies for retrieving related time-series examples and incorporating them into the forecast. Through experiments and mechanistic studies, we demonstrate that RAF indeed improves forecasting accuracy across diverse time-series domains and that the improvement is more significant for larger TSFM sizes.

Paper Structure

This paper contains 40 sections, 2 theorems, 1 equation, 8 figures, 17 tables.

Key Result

Theorem 1

A transformer architecture with two attention blocks and absolute positional encoding can solve the Time-Series Retrieval (TS-R) problem by employing patch embeddings with stride length 1 and by suitably encoding the norms and directions of the patches.
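
To make the two ingredients named in the theorem concrete, the sketch below extracts patches with stride length 1 and splits each patch into a norm and a unit direction. It only illustrates those terms and is not the construction used in the paper's proof; the function names `stride1_patches` and `norm_and_direction`, and the patch length of 2, are hypothetical choices made for this example.

```python
import math


def stride1_patches(series, patch_len):
    """Return every length-`patch_len` window of `series`, moving one step at a time."""
    if patch_len < 1 or patch_len > len(series):
        raise ValueError("patch_len must be between 1 and len(series)")
    return [series[i:i + patch_len] for i in range(len(series) - patch_len + 1)]


def norm_and_direction(patch):
    """Split a patch into its Euclidean norm and its unit-norm direction."""
    norm = math.sqrt(sum(x * x for x in patch))
    if norm == 0.0:
        # An all-zero patch has no direction; return a zero vector by convention.
        return 0.0, [0.0] * len(patch)
    return norm, [x / norm for x in patch]


# Toy series and patch length, chosen only for illustration.
series = [3.0, 4.0, 0.0, 5.0]
patches = stride1_patches(series, patch_len=2)
encoded = [norm_and_direction(p) for p in patches]
```

With stride 1, a series of length n yields n - patch_len + 1 overlapping patches, so every possible alignment of a pattern appears as some patch.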

Figures (8)

  • Figure 1: Overview of the Retrieval Augmented Forecasting (RAF) framework. Top left: The original query is used to retrieve the best-matching time series (RTS 1, RTS 2, RTS 3, …). Bottom left: We utilize the best match (RTS 1) to form the retrieved context and retrieved future. Bottom right: These segments are then augmented with the original time series to produce an augmented input for forecasting. Top right: The forecasts generated by Chronos Base. RAF outperforms the base model and returns a forecast closer to the actual future values.
  • Figure 2: We generated synthetic time-series data by transposing two sinusoidal signals and projecting them via orthogonal projections. We assessed extrapolation behavior using scaled mean squared error (assuming $0$ prediction as baseline) and chose a context and forecast length of $C=30$ and $H=30$. Evaluations were conducted on Chronos-{mini, small, base}.
  • Figure 3: Aggregated Relative WQL performance for Chronos Mini and Chronos Base across datasets and benchmarks. This figure compares WQL for two model configurations, Chronos Mini and Chronos Base, highlighting the relative improvement from using RAF within each configuration.
  • Figure 4: Aggregated Relative MASE performance for Chronos Mini and Chronos Base across datasets and benchmarks. This figure compares MASE for two model configurations, Chronos Mini and Chronos Base, highlighting the relative improvement from using RAF within each configuration.
  • Figure 5: Qualitative results for Benchmark I datasets with $C = 50$ and $H=15$ on Chronos Base.
  • ...and 3 more figures
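
The retrieve-then-augment flow described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the similarity measure (Euclidean distance between z-normalized windows), the concatenation order (retrieved context, retrieved future, then the original query), and all function names are hypothetical choices. The augmented input would then be passed to a forecaster such as Chronos, which is not called here.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def distance(a, b):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_best_match(query, database, horizon):
    """Scan every window in `database` and return the (context, future) pair
    whose context is closest to `query` (assumed similarity measure)."""
    context_len = len(query)
    q = znorm(query)
    best = None
    for series in database:
        for start in range(len(series) - context_len - horizon + 1):
            context = series[start:start + context_len]
            future = series[start + context_len:start + context_len + horizon]
            d = distance(q, znorm(context))
            if best is None or d < best[0]:
                best = (d, context, future)
    if best is None:
        raise ValueError("no series in the database is long enough")
    return best[1], best[2]


def augment(query, retrieved_context, retrieved_future):
    """Prepend the retrieved segments to the query (assumed ordering)."""
    return retrieved_context + retrieved_future + query


# Toy data, chosen only for illustration.
database = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [5.0, 5.0, 5.0, 1.0, 9.0, 1.0, 9.0, 1.0],
]
query = [10.0, 11.0, 12.0]
context, future = retrieve_best_match(query, database, horizon=2)
augmented = augment(query, context, future)
```

Z-normalizing before comparison makes the match depend on shape rather than level, which is why the rising query matches the ramp in the first series despite the offset between them.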

Theorems & Definitions (4)

  • Definition 1: TS-R problem
  • Theorem 1: informal
  • Theorem 2: TS-R Problem
  • Proof 1