MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model

Samuel Yoon, Jongwon Kim, Juyoung Ha, Young Myoung Ko

TL;DR

MOMEMTO addresses cross-domain time series anomaly detection by integrating a patch-based memory module with a pre-trained MOMENT encoder to curb over-generalization. The approach supports multi-domain training, enabling a single model to learn domain-general normal patterns while maintaining patch-level semantics for accurate reconstruction. Empirical results across 23 univariate datasets show that MOMEMTO often outperforms its backbone MOMENT and other baselines, with notable gains in few-shot scenarios and across domains. The work demonstrates improved detection robustness, efficiency, and cross-domain knowledge sharing, offering a practical pathway to scalable time series anomaly detection in heterogeneous environments.

Abstract

Recently, reconstruction-based deep models have been widely used for time series anomaly detection, but as their capacity and generalization capability increase, these models tend to over-generalize, often reconstructing unseen anomalies accurately. Prior works have attempted to mitigate this by incorporating a memory architecture that stores prototypes of normal patterns. Nevertheless, these approaches suffer from high training costs and have yet to be effectively integrated with time series foundation models (TFMs). To address these challenges, we propose MOMEMTO, an improved variant of a TFM for anomaly detection, enhanced with a patch-based memory module to mitigate over-generalization. The memory module is designed to capture representative normal patterns from multiple domains and enables a single model to be jointly fine-tuned across multiple datasets through a multi-domain training strategy. MOMEMTO initializes memory items with latent representations from a pre-trained encoder, organizes them into patch-level units, and updates them via an attention mechanism. We evaluate our method using 23 univariate benchmark datasets. Experimental results demonstrate that MOMEMTO, as a single model, achieves higher scores on AUC and VUS metrics compared to baseline methods, and further enhances the performance of its backbone TFM, particularly in few-shot learning scenarios.
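
The memory mechanism described in the abstract (items initialized from encoder latents, stored at patch level, updated via attention) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`init_memory`, `read_memory`, `update_memory`), the random initialization, the top-`top_k` masking, and the convex-blend update with a fixed `rate` are all assumptions made for illustration, and random vectors stand in for the pre-trained encoder's patch embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_memory(patch_embeddings, num_items, rng):
    # Assumption: seed memory with randomly chosen patch embeddings of normal data.
    idx = rng.choice(patch_embeddings.shape[0], size=num_items, replace=False)
    return patch_embeddings[idx].copy()

def read_memory(queries, memory, top_k):
    # Attention read: each query patch is rebuilt from its top_k most similar items.
    scores = queries @ memory.T / np.sqrt(queries.shape[1])   # (patches, items)
    kth = np.sort(scores, axis=1)[:, -top_k][:, None]          # k-th largest score per row
    masked = np.where(scores >= kth, scores, -np.inf)
    weights = softmax(masked, axis=1)
    return weights @ memory, weights

def update_memory(queries, memory, rate=0.1):
    # Hypothetical update rule: move each item toward an attention-weighted
    # mean of the query patches (a simple blend, not the paper's gate).
    scores = memory @ queries.T / np.sqrt(queries.shape[1])   # (items, patches)
    weights = softmax(scores, axis=1)
    return (1.0 - rate) * memory + rate * (weights @ queries)

rng = np.random.default_rng(0)
normal_patches = rng.normal(size=(64, 16))   # stand-in for encoder patch latents
memory = init_memory(normal_patches, num_items=8, rng=rng)
queries = rng.normal(size=(5, 16))           # patch latents of one input window
retrieved, weights = read_memory(queries, memory, top_k=3)
memory = update_memory(queries, memory)
```

Because each retrieved patch is a mixture of stored normal patterns, an anomalous patch would tend to be reconstructed less faithfully, which is the intuition behind using memory to curb over-generalization.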

Paper Structure

This paper contains 37 sections, 4 equations, 7 figures, 12 tables, and 2 algorithms.

Figures (7)

  • Figure 1: Architecture of MOMEMTO.
  • Figure 2: Effect of the training data ratio on AUC-PR.
  • Figure 3: Comparison of anomaly score distributions between normal (blue) and anomalous (red) samples across a subset of domains. The upper row shows the results obtained with the backbone model MOMENT, while the lower row corresponds to our proposed model.
  • Figure 4: Visualization of row-normalized accumulated domain-to-memory similarity over the entire training process (left) and at test time (right). Each heatmap shows how subsequences from a true domain reference memory items across domains.
  • Figure 5: Effect of the number of referenced memory items ($K$) and the training data ratio on AUC-PR.
  • ...and 2 more figures