Evaluating Time Series Foundation Models on Noisy Periodic Time Series

Syamantak Datta Gupta

TL;DR

This work evaluates zero-shot, long-horizon forecasting by time series foundation models (TSFMs) on noisy periodic data. The synthetic data are generated as sums of sinusoids with additive Gaussian noise, and the TSFMs are benchmarked against two statistical baselines: a Fourier-transform (FFT)-based reconstruction and a linear autoregressive (AR) model. CHRONOS and TimesFM can outperform FFT and unregularized AR under high sampling rates and bounded periods, but all TSFMs struggle with very long periods, high noise, and low sampling rates, where AR often dominates. The results highlight both the potential and the limitations of current TSFMs for time series forecasting and motivate broader, more diverse evaluations and targeted fine-tuning.

Abstract

While recent advances in foundation models have significantly impacted machine learning, rigorous evaluation of time series foundation models (TSFMs) remains largely underexplored. This paper presents an empirical study evaluating the zero-shot, long-horizon forecasting abilities of several leading TSFMs on two synthetic datasets consisting of noisy periodic time series. We assess model efficacy across different noise levels, underlying frequencies, and sampling rates. As benchmarks for comparison, we choose two statistical techniques: a Fourier transform (FFT)-based approach and a linear autoregressive (AR) model. Our findings demonstrate that TSFMs can match or outperform the statistical approaches for time series with bounded periods and higher sampling rates, but their forecasting abilities deteriorate with longer periods, higher noise levels, lower sampling rates, and more complex shapes of the time series.
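To make the setup concrete, the following is a minimal sketch of the kind of experiment the abstract describes: a sum of sinusoids with additive Gaussian noise, forecast by an FFT-based extrapolation and by a linear AR model fitted with ordinary least squares. It is not the paper's code. The function names, frequencies, amplitudes, SNR, sampling rate, AR order, and number of retained FFT components are all illustrative assumptions, and no TSFM is called.

```python
import numpy as np

def make_series(n_points, freqs, amps, snr_db, sample_rate, rng):
    """Sum of sinusoids plus white Gaussian noise at a target SNR (dB)."""
    t = np.arange(n_points) / sample_rate
    clean = np.zeros(n_points)
    for f, a in zip(freqs, amps):
        clean += a * np.sin(2 * np.pi * f * t)
    noise_power = np.mean(clean ** 2) / (10 ** (snr_db / 10))
    noisy = clean + rng.normal(0.0, np.sqrt(noise_power), size=n_points)
    return clean, noisy

def ar_forecast(context, order, horizon):
    """Fit AR(order) by least squares, then roll the forecast forward."""
    n = len(context)
    # Column k holds lag k+1, so each row is [y[t-1], ..., y[t-order]].
    X = np.column_stack(
        [context[order - k - 1 : n - k - 1] for k in range(order)]
    )
    y = context[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    history = list(context[-order:])
    preds = []
    for _ in range(horizon):
        lags = history[::-1][:order]  # most recent value first
        nxt = float(np.dot(coef, lags))
        preds.append(nxt)
        history.append(nxt)
    return np.array(preds)

def fft_forecast(context, n_keep, horizon):
    """Keep the n_keep largest FFT bins and extend them past the context."""
    n = len(context)
    spectrum = np.fft.rfft(context)
    bin_freqs = np.fft.rfftfreq(n)  # cycles per sample
    keep = np.argsort(np.abs(spectrum))[-n_keep:]
    t = np.arange(n, n + horizon)
    preds = np.zeros(horizon)
    for k in keep:
        # DC and Nyquist bins appear once; all others have a mirror bin.
        single = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 if single else 2.0
        amp = scale * np.abs(spectrum[k]) / n
        phase = np.angle(spectrum[k])
        preds += amp * np.cos(2 * np.pi * bin_freqs[k] * t + phase)
    return preds

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Illustrative settings (assumed, not taken from the paper).
rng = np.random.default_rng(0)
context_len, horizon = 512, 128
clean, noisy = make_series(
    context_len + horizon, freqs=[1.0, 3.0], amps=[1.0, 0.5],
    snr_db=20.0, sample_rate=64.0, rng=rng,
)
context = noisy[:context_len]
future_clean = clean[context_len:]

ar_pred = ar_forecast(context, order=32, horizon=horizon)
fft_pred = fft_forecast(context, n_keep=2, horizon=horizon)
print("AR  MSE:", mse(ar_pred, future_clean))
print("FFT MSE:", mse(fft_pred, future_clean))
```

In this sketch the two frequencies fall exactly on FFT bins of the 512-sample context, which is a favorable case for the FFT baseline. Frequencies that fall between bins, or periods longer than the context, would leak across bins and degrade the extrapolation.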

Paper Structure

This paper contains 8 sections, 3 equations, 8 figures, and 1 table.

Figures (8)

  • Figure 1: Average Mean-Squared Error as a function of number of sinusoidal components, signal-to-noise ratio and sampling ratio. Top row represents Set A and bottom row represents Set B
  • Figure 2: Boxplots of mean-squared errors and outliers. Left: Set A, right: Set B
  • Figure 3: Example time series and forecasts (context window truncated) from CHRONOS-base and AR. Left: Set A, right: Set B. The CHRONOS model beats AR for the smoother time series in Set A but misses the pattern for the zig-zag series in Set B, which results from a lower sampling rate
  • Figure 4: Median Mean-Squared Error as a function of number of sinusoidal components, signal-to-noise ratio and sampling ratio. Top row represents Set A and bottom row represents Set B
  • Figure 5: Median Mean Absolute Error as a function of number of sinusoidal components, signal-to-noise ratio and sampling ratio. Top row represents Set A and bottom row represents Set B
  • ...and 3 more figures