TS-Memory: Plug-and-Play Memory for Time Series Foundation Models

Sisuo Lyu; Siru Zhong; Tiegang Chen; Weilin Ruan; Qingxiang Liu; Taiqiang Lv; Qingsong Wen; Raymond Chi-Wing Wong; Yuxuan Liang

TL;DR

TS-Memory addresses the challenge of adapting Time Series Foundation Models to distribution-shifted domains without incurring repeated retrieval latency or maintaining multiple domain-specific backbones. It distills offline, leakage-safe kNN retrieval signals into a lightweight parametric memory that can be fused with frozen backbones in constant time during inference. The two-stage training combines privileged distributional supervision with confidence-gated distillation, yielding robust improvements in both point and probabilistic forecasts across diverse TSFMs and datasets, while preserving retrieval-free, low-latency deployment. Empirically, TS-Memory outperforms both parametric adapters and online retrieval baselines while keeping inference efficiency comparable to the frozen backbone.

Abstract

Time Series Foundation Models (TSFMs) achieve strong zero-shot forecasting through large-scale pre-training, but adapting them to downstream domains under distribution shift remains challenging. Existing solutions face a trade-off: Parametric Adaptation can cause catastrophic forgetting and requires costly multi-domain maintenance, while Non-Parametric Retrieval improves forecasts but incurs high inference latency due to datastore search. We propose Parametric Memory Distillation and implement it as TS-Memory, a lightweight memory adapter that augments frozen TSFMs. TS-Memory is trained in two stages. First, we construct an offline, leakage-safe kNN teacher that synthesizes confidence-aware quantile targets from retrieved futures. Second, we distill this retrieval-induced distributional correction into a lightweight memory adapter via confidence-gated supervision. During inference, TS-Memory fuses memory and backbone predictions with constant-time overhead, enabling retrieval-free deployment. Experiments across diverse TSFMs and benchmarks demonstrate consistent improvements in both point and probabilistic forecasting over representative adaptation methods, with efficiency comparable to the frozen backbone.
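The two training stages and the fusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Euclidean distance, the quantile levels, the exponential confidence score, the hard threshold gate, and the scalar fusion weight are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import numpy as np

# Assumed quantile levels; the abstract does not state which levels are used.
QUANTILES = np.array([0.1, 0.5, 0.9])


def knn_teacher(query, keys, futures, key_end_times, query_time, k=3):
    """Build quantile targets and a confidence score from retrieved futures.

    Assumed leakage rule: a datastore window is eligible only if its
    future segment ends at or before `query_time`.
    """
    eligible = np.flatnonzero(key_end_times <= query_time)
    if eligible.size == 0:
        return None, 0.0
    dists = np.linalg.norm(keys[eligible] - query, axis=1)
    order = np.argsort(dists)[:k]
    nearest = eligible[order]
    # Empirical quantiles over the retrieved futures, per horizon step.
    targets = np.quantile(futures[nearest], QUANTILES, axis=0)  # shape (Q, H)
    # Assumed confidence: closer neighbours give a score nearer to 1.
    confidence = float(np.exp(-dists[order].mean()))
    return targets, confidence


def gated_pinball_loss(pred_q, target_q, confidence, threshold=0.5):
    """Pinball loss against teacher quantiles, switched off below the threshold."""
    if confidence < threshold:
        return 0.0
    diff = target_q - pred_q  # shape (Q, H)
    tau = QUANTILES[:, None]
    loss = np.maximum(tau * diff, (tau - 1.0) * diff)
    return float(confidence * loss.mean())


def fuse(backbone_q, memory_q, gate):
    """Convex combination of backbone and memory quantile forecasts.

    `gate` is a fixed scalar here; a learned, input-dependent gate is an
    equally plausible reading of "adaptive fusion".
    """
    return (1.0 - gate) * backbone_q + gate * memory_q
```

In this sketch the datastore search happens only inside `knn_teacher`, which would run offline. At inference only `fuse` is called, so the added cost does not depend on the datastore size.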

Paper Structure (31 sections, 40 equations, 7 figures, 15 tables, 1 algorithm)

Introduction
Related Work
Problem Definition
Methodology
Privileged Supervision Construction
Confidence-Gated Memory Distillation
Inference via Adaptive Fusion
Experiment
Experimental Setup
TS-Memory Performance Across Backbones
Comparison with Adaptation Baselines
Transfer Generality
Model Analysis
Conclusion and Future Work
Leakage-Safe Retrieval Teacher Construction
...and 16 more sections

Figures (7)

Figure 1: Comparison of TSFM adaptation paradigms: (a) Parametric Adaptation; (b) Non-Parametric Retrieval; (c) Parametric Memory Distillation (Ours).
Figure 2: TS-Memory framework.
Figure 3: TS-Memory vs. LoRA under different train-test domains. Full per-dataset results are provided in Table \ref{tab:domain_split_lora_tsmemory}.
Figure 4: Ablation study of TS-Memory components.
Figure 5: Scaling Analysis of PlugMem Capacity.
...and 2 more figures
