A decoder-only foundation model for time-series forecasting
Abhimanyu Das, Weihao Kong, Rajat Sen, Yichen Zhou
TL;DR
This work presents TimesFM, a decoder-only, patch-based transformer designed as a foundation model for time-series forecasting and trained from scratch on a large, diverse mix of real and synthetic data. It delivers strong zero-shot forecasts on unseen datasets and granularities, approaching the accuracy of state-of-the-art supervised models without dataset-specific fine-tuning. Key design choices include patch-based input and output handling, output patches longer than input patches so that long horizons need fewer decoding steps, and masking strategies that expose the model to varying context lengths. Large-scale pretraining, empirical validation, and a planned open release position the model as a practical, general-purpose forecaster. The authors also acknowledge ethical considerations and outline future extensions: probabilistic forecasts, covariates, and fine-tuning.
Abstract
Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time-series corpus, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.
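As a rough illustration of the patching idea, the sketch below splits a history into fixed-length input patches with a padding mask, and counts how many autoregressive decoding steps a given horizon needs for a given output patch length. This is a minimal sketch, not the authors' implementation: the function names, the left-padding convention, and the patch lengths (32 and 128) are assumptions chosen for illustration only.

```python
import math


def split_into_patches(series, patch_len):
    """Split a series into non-overlapping patches of length patch_len.

    Assumption: the series is left-padded with zeros so its length becomes
    a multiple of patch_len. The returned masks mark padded positions with 1
    and real observations with 0.
    """
    pad = (-len(series)) % patch_len
    padded = [0.0] * pad + list(series)
    mask = [1] * pad + [0] * len(series)
    patches = [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]
    masks = [mask[i:i + patch_len] for i in range(0, len(mask), patch_len)]
    return patches, masks


def num_decode_steps(horizon, output_patch_len):
    """Number of autoregressive steps needed to cover the forecast horizon."""
    return math.ceil(horizon / output_patch_len)


# Hypothetical example: a history of 70 points, input patches of length 32.
history = [float(t) for t in range(70)]
patches, masks = split_into_patches(history, patch_len=32)

# Hypothetical horizon of 256 points: compare an output patch as long as the
# input patch (32) with a longer output patch (128).
steps_short = num_decode_steps(256, output_patch_len=32)
steps_long = num_decode_steps(256, output_patch_len=128)
```

With these illustrative numbers, the longer output patch covers the same horizon in fewer decoding steps, which is the efficiency argument for making output patches longer than input patches.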
