MOMENT: A Family of Open Time-series Foundation Models

Mononito Goswami, Konrad Szafer, Arjun Choudhry, Yifu Cai, Shuo Li, Artur Dubrawski

TL;DR

MOMENT presents the first open family of time-series foundation models, trained on a large, diverse collection of public data called the Time Series Pile. It addresses the lack of a cohesive public time series repository and of evaluation benchmarks. It uses a patch-based transformer with masked time-series modeling to pre-train representations that are effective across forecasting, classification, anomaly detection, and imputation, even in zero-shot and limited-supervision settings. The authors provide a benchmark that extends prior work and show that MOMENT can outperform several baselines on multiple tasks, while also offering insights into model scaling and cross-modal transfer. They emphasize open science by releasing data, models, and code, and suggest future work on multi-modal time-series models and causal forecasting objectives. Overall, MOMENT demonstrates the feasibility and value of large-scale, open time-series foundation models for practical analysis under resource constraints.

Abstract

We introduce MOMENT, a family of open-source foundation models for general-purpose time series analysis. Pre-training large models on time series data is challenging due to (1) the absence of a large and cohesive public time series repository, and (2) diverse time series characteristics which make multi-dataset training onerous. Additionally, (3) experimental benchmarks to evaluate these models, especially in scenarios with limited resources, time, and supervision, are still in their nascent stages. To address these challenges, we compile a large and diverse collection of public time series, called the Time Series Pile, and systematically tackle time series-specific challenges to unlock large-scale multi-dataset pre-training. We then build on recent work to design a benchmark to evaluate time series foundation models on diverse tasks and datasets in limited supervision settings. Experiments on this benchmark demonstrate the effectiveness of our pre-trained models with minimal data and task-specific fine-tuning. Finally, we present several interesting empirical observations about large pre-trained time series models. Pre-trained models (AutonLab/MOMENT-1-large) and the Time Series Pile (AutonLab/Timeseries-PILE) are available on Hugging Face.

Paper Structure (60 sections, 12 figures, 34 tables)

Figures (12)

  • Figure 1: MOMENT can solve multiple time series analysis tasks well (App. \ref{app:experimental-setup-and-results}).
  • Figure 2: Time Series Pile data splits. To avoid data contamination, we carefully partition all datasets into disjoint train, validation, and test splits. We adhere to the predefined splits provided by the creators of each dataset. In cases where such splits are unavailable, we randomly sample 60% of the data for training, 10% for validation, and 30% for testing. We only use the training splits of all datasets for pre-training.
  • Figure 3: Overview of MOMENT. A time series is broken into disjoint fixed-length sub-sequences called patches, and each patch is mapped into a $D$-dimensional patch embedding. During pre-training, we mask patches uniformly at random by replacing their patch embeddings using a special mask embedding [MASK]. The goal of pre-training is to learn patch embeddings which can be used to reconstruct the input time series using a light-weight reconstruction head.
  • Figure 4: What is MOMENT learning? Principal components of the embeddings of synthetically generated sinusoids suggest that MOMENT can capture subtle trend, scale, frequency, and phase information. In each experiment, $c$ controls the factor of interest, for example the power of the trend polynomial $c \in (\frac{1}{8}, 8)$ (N-BEATS; Fig. \ref{fig:interpretability_timeseries}), and the frequency $c \in (1, 32)$ of the generated sine waves (Fig. \ref{fig:interpretability_timeseries}). We generate multiple sine waves by varying $c$, derive their sequence-level representations using MOMENT, and visualize them in a 2-dimensional space using PCA and t-SNE in Fig. \ref{fig:interpretability} and Fig. \ref{fig:interpretability_appendix}.
  • Figure 5: PCA and t-SNE visualizations of representations learned by MOMENT on the 3 largest UCR datasets. Different colors represent different classes. Even without dataset-specific fine-tuning, MOMENT learns distinct representations for different classes.
  • ...and 7 more figures
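
The patching and masking step described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `patchify` and `mask_patch_embeddings` are hypothetical, the series length, patch length, embedding size, and mask ratio are illustrative values chosen for the example, and random matrices stand in for the learned patch embedding and the light-weight reconstruction head. The transformer encoder is omitted entirely.

```python
import numpy as np


def patchify(series, patch_len):
    """Split a 1-D series into disjoint fixed-length patches (hypothetical helper)."""
    if series.ndim != 1 or series.size % patch_len != 0:
        raise ValueError("series must be 1-D with length divisible by patch_len")
    return series.reshape(-1, patch_len)


def mask_patch_embeddings(embeddings, mask_embedding, mask_ratio, rng):
    """Replace a uniformly random subset of patch embeddings with mask_embedding."""
    n_patches = embeddings.shape[0]
    n_masked = int(round(mask_ratio * n_patches))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    is_masked = np.zeros(n_patches, dtype=bool)
    is_masked[masked_idx] = True
    out = embeddings.copy()
    out[is_masked] = mask_embedding
    return out, is_masked


rng = np.random.default_rng(0)

# Illustrative sizes only; these are assumptions, not values taken from the text above.
series_len, patch_len, embed_dim, mask_ratio = 64, 8, 16, 0.25

series = np.sin(np.linspace(0.0, 4.0 * np.pi, series_len))
patches = patchify(series, patch_len)  # shape: (n_patches, patch_len)

# Random matrices stand in for the learned embedding and reconstruction layers.
w_embed = rng.normal(size=(patch_len, embed_dim))
w_head = rng.normal(size=(embed_dim, patch_len))
mask_embedding = np.zeros(embed_dim)  # stand-in for the learned [MASK] embedding

embeddings = patches @ w_embed  # one D-dimensional embedding per patch
masked_embeddings, is_masked = mask_patch_embeddings(
    embeddings, mask_embedding, mask_ratio, rng
)

# A real model would pass masked_embeddings through a transformer encoder here.
reconstruction = masked_embeddings @ w_head  # shape: (n_patches, patch_len)

# Reconstruction error on the masked patches (restricting to them is an assumption).
masked_mse = float(np.mean((reconstruction[is_masked] - patches[is_masked]) ** 2))
print(patches.shape, int(is_masked.sum()), masked_mse)
```

Because the patches are disjoint, reshaping them back to one dimension recovers the original series exactly, so the reconstruction target is simply the input itself.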