Are Time Series Foundation Models Susceptible to Catastrophic Forgetting?
Nouha Karaouli, Denis Coquenet, Elisa Fromont, Martial Mermillod, Marina Reyboz
TL;DR
This paper investigates catastrophic forgetting in Time Series Foundation Models (TSFMs) during continual learning. It introduces a two-stage fine-tuning protocol on synthetic, multi-sinusoidal time series and quantifies forgetting via mean absolute error (MAE) and backward transfer (BWT). The results reveal a clear stability-plasticity trade-off: higher learning rates and longer fine-tuning improve adaptation to new data but erode performance on previously learned tasks. An intermediate learning rate of around $1 \times 10^{-5}$ with 5–10 epochs offers the best balance but does not fully prevent forgetting. The findings highlight the need for continual-learning methods tailored to TSFMs to enable reliable sequential adaptation in non-stationary environments.
Abstract
Time Series Foundation Models (TSFMs) have shown promising zero-shot generalization across diverse forecasting tasks. However, their robustness to continual adaptation remains underexplored. In this work, we investigate the extent to which TSFMs suffer from catastrophic forgetting when fine-tuned sequentially on multiple datasets. Using synthetic datasets designed with varying degrees of periodic structure, we measure the trade-off between adaptation to new data and retention of prior knowledge. Our experiments reveal that, while fine-tuning improves performance on new tasks, it often causes significant degradation on previously learned ones, illustrating a fundamental stability-plasticity dilemma.
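To make the evaluation quantities concrete, the following is a minimal sketch of how forgetting could be measured in a two-stage setting (fine-tune on task A, then on task B). It is not the authors' implementation. The generator `multi_sinusoid`, its parameters, and the sign convention in `backward_transfer` (negative values indicate forgetting, following the common BWT convention adapted to an error metric) are assumptions made for illustration; the paper's exact BWT definition may differ.

```python
import numpy as np


def multi_sinusoid(n_steps, freqs, amps, noise_std=0.0, seed=0):
    """Hypothetical generator: sum of sinusoids plus optional Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    series = sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))
    return series + rng.normal(0.0, noise_std, size=n_steps)


def mae(y_true, y_pred):
    """Mean absolute error between targets and forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def backward_transfer(mae_a_after_stage1, mae_a_after_stage2):
    """Assumed two-task BWT for an error metric.

    Compares the error on task A right after fine-tuning on A (stage 1)
    with the error on task A after further fine-tuning on B (stage 2).
    A negative value means the error on A increased, i.e. forgetting.
    """
    return mae_a_after_stage1 - mae_a_after_stage2
```

In use, one would evaluate the same held-out task-A windows after each stage and pass the two MAE values to `backward_transfer`; plasticity is read separately from the MAE on task B after stage 2.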
