Unified Training of Universal Time Series Forecasting Transformers
Gerald Woo, Chenghao Liu, Akshat Kumar, Caiming Xiong, Silvio Savarese, Doyen Sahoo
TL;DR
The paper tackles universal time series forecasting by introducing Moirai, a masked encoder Transformer that handles cross-frequency data, an arbitrary number of variates, and flexible probabilistic outputs. Moirai is pre-trained on LOTSA, a large-scale open time series archive, yielding a single model capable of zero-shot forecasting across diverse datasets. Empirical results show competitive or superior zero-shot performance in both in-distribution and out-of-distribution settings, including probabilistic and long-horizon forecasts, and extensive ablations confirm the value of multiple patch sizes, any-variate attention, and a mixture distribution head. The work highlights the potential of unified training for Large Time Series Models (LTMs) and outlines practical considerations and future directions for scaling and multi-modality.
Abstract
Deep learning for time series forecasting has traditionally operated within a one-model-per-dataset framework, limiting its potential to leverage the game-changing impact of large pre-trained models. The concept of universal forecasting, emerging from pre-training on a vast collection of time series datasets, envisions a single Large Time Series Model capable of addressing diverse downstream forecasting tasks. However, constructing such a model poses unique challenges specific to time series data: i) cross-frequency learning, ii) accommodating an arbitrary number of variates for multivariate time series, and iii) addressing the varying distributional properties inherent in large-scale data. To address these challenges, we present novel enhancements to the conventional time series Transformer architecture, resulting in our proposed Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai). Trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains, Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models. Code, data, and model weights can be found at https://github.com/SalesforceAIResearch/uni2ts.
