Are Time Series Foundation Models Susceptible to Catastrophic Forgetting?

Nouha Karaouli, Denis Coquenet, Elisa Fromont, Martial Mermillod, Marina Reyboz

TL;DR

This paper investigates catastrophic forgetting in Time Series Foundation Models (TSFMs) during continual learning. It introduces a two-stage fine-tuning protocol on synthetic, multi-sinusoidal time series and quantifies forgetting with mean absolute error (MAE) and backward transfer (BWT). The results reveal a clear stability-plasticity trade-off: higher learning rates and longer fine-tuning improve adaptation to new data but erode performance on previously learned tasks. An intermediate learning rate of about $1 \times 10^{-5}$ with 5–10 epochs offers the best balance, but it does not fully prevent forgetting. The findings highlight the need for continual-learning methods tailored to time series foundation models to enable reliable sequential adaptation in non-stationary environments.
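To make the two metrics concrete, here is a minimal sketch of how MAE and BWT could be computed for a sequential fine-tuning run. It assumes the common continual-learning definition of BWT (the average change in a task's score between the stage where it was learned and the final stage); the paper's exact formula and sign convention may differ. The function names and the error values in the usage example are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and forecasts."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def backward_transfer(errors: Sequence[Sequence[float]]) -> float:
    """BWT over an error matrix (assumed definition, see lead-in).

    errors[i][j] is the MAE on dataset j measured after fine-tuning stage i.
    Because MAE is an error, a POSITIVE result here means the error on
    earlier datasets grew, i.e. forgetting occurred.
    """
    n_stages = len(errors)
    if n_stages < 2:
        raise ValueError("need at least two fine-tuning stages")
    final_row = errors[-1]
    # Compare the final error on each earlier dataset with the error
    # measured right after that dataset was learned.
    deltas = [final_row[j] - errors[j][j] for j in range(n_stages - 1)]
    return sum(deltas) / len(deltas)


# Hypothetical two-stage run (D1 then D2); values are illustrative only.
illustrative_errors = [
    [0.10, 0.50],  # after stage 1: MAE on D1, MAE on D2
    [0.30, 0.20],  # after stage 2: MAE on D1, MAE on D2
]
bwt = backward_transfer(illustrative_errors)
```

In this made-up example the error on D1 rises from 0.10 to 0.30 after fine-tuning on D2, so the sketch reports a BWT of about 0.20 in MAE units, which corresponds to forgetting under the sign convention assumed above.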

Abstract

Time Series Foundation Models (TSFMs) have shown promising zero-shot generalization across diverse forecasting tasks. However, their robustness to continual adaptation remains underexplored. In this work, we investigate the extent to which TSFMs suffer from catastrophic forgetting when fine-tuned sequentially on multiple datasets. Using synthetic datasets designed with varying degrees of periodic structure, we measure the trade-off between adaptation to new data and retention of prior knowledge. Our experiments reveal that, while fine-tuning improves performance on new tasks, it often causes significant degradation on previously learned ones, illustrating a fundamental stability-plasticity dilemma.
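As an illustration of what a synthetic dataset with controllable periodic structure might look like, the sketch below sums a few sinusoids and optionally adds Gaussian noise. This is not the authors' generator: the function name, the component parameters (amplitude, period, phase), and the example settings for the two datasets are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

# Each component is (amplitude, period_in_steps, phase_in_radians).
Component = Tuple[float, float, float]


def multi_sinusoid(
    n_steps: int,
    components: Sequence[Component],
    noise_std: float = 0.0,
    seed: int = 0,
) -> List[float]:
    """Sum of sinusoids sampled at integer time steps, plus optional noise."""
    rng = random.Random(seed)  # local RNG so results are reproducible
    series = []
    for t in range(n_steps):
        value = sum(
            amplitude * math.sin(2.0 * math.pi * t / period + phase)
            for amplitude, period, phase in components
        )
        series.append(value + rng.gauss(0.0, noise_std))
    return series


# Hypothetical settings: a strongly periodic series and a noisier,
# multi-frequency one. These are not the paper's D1 and D2 parameters.
d1_like = multi_sinusoid(48, [(1.0, 24.0, 0.0)])
d2_like = multi_sinusoid(
    48, [(1.0, 24.0, 0.0), (0.5, 7.0, 1.0)], noise_std=0.1, seed=1
)
```

Varying the number of components and the noise level gives one simple knob for the "degree of periodic structure"; the paper may control it differently.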

Paper Structure

This paper contains 6 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Forecasting results on D1 and D2 at each fine-tuning stage. The left panel shows degradation on D1 due to catastrophic forgetting, while the right panel shows improved adaptation to D2 after fine-tuning.