R-Tuning: Wavelet-Decomposed Replay and Semantic Alignment for Continual Adaptation of Pretrained Time-Series Models
Tianyi Yin, Jingwei Wang, Chenze Wang, Han Wang, Jiexuan Cai, Min Liu, Yunlong Ma, Kun Gao, Yuting Song, Weiming Shen
TL;DR
R-Tuning addresses continual adaptation of pre-trained time-series forecasting models when access to the original training data is limited. It has two core components: Wavelet-guided Replay, which produces frequency-aware, diverse replay samples from the outputs of a frozen old model, and Semantic Alignment via latent distillation, which preserves prior knowledge in a compact latent space. The method reduces MAE and MSE on new tasks by up to 46.9% and 46.8%, respectively, while maintaining or slightly improving performance on old tasks (gains of up to 5.7% and 6.0%). It remains effective in few-shot regimes where synthetic proxy samples amount to only 5% of the new-task dataset. These results indicate a practical, data-efficient strategy for deploying and updating pre-trained time-series models in dynamic environments, with potential for online and multi-modal extensions.
Abstract
Pre-trained models have demonstrated exceptional generalization capabilities in time-series forecasting; however, adapting them to evolving data distributions remains a significant challenge. A key hurdle lies in accessing the original training data, as fine-tuning solely on new data often leads to catastrophic forgetting. To address this issue, we propose Replay Tuning (R-Tuning), a novel framework designed for the continual adaptation of pre-trained time-series models. R-Tuning constructs a unified latent space that captures both prior and current task knowledge through a frequency-aware replay strategy. Specifically, it augments model-generated samples via wavelet-based decomposition across multiple frequency bands, generating trend-preserving and fusion-enhanced variants to improve representation diversity and replay efficiency. To further reduce reliance on synthetic samples, R-Tuning introduces a latent consistency constraint that aligns new representations with the prior task space. This constraint guides joint optimization within a compact and semantically coherent latent space, ensuring robust knowledge retention and adaptation. Extensive experimental results demonstrate the superiority of R-Tuning, which reduces MAE and MSE by up to 46.9% and 46.8%, respectively, on new tasks, while preserving prior knowledge with gains of up to 5.7% and 6.0% on old tasks. Notably, under few-shot settings, R-Tuning outperforms all state-of-the-art baselines even when synthetic proxy samples account for only 5% of the new task dataset.
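The two ideas in the abstract, wavelet-based augmentation of model-generated samples and a latent consistency constraint, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the function names (`haar_dwt`, `trend_preserving_variant`, `fusion_variant`, `latent_consistency_loss`), the `jitter` and `alpha` parameters, and the choice of mean squared error as the alignment term. The paper's actual decomposition, variant construction, and loss may differ.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level Haar DWT of a 1-D signal (length divisible by 2**levels).
    Returns (approx, details), with details ordered finest to coarsest."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-frequency band
        approx = (even + odd) / np.sqrt(2.0)         # low-frequency trend
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from coarsest to finest level."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def trend_preserving_variant(x, levels, rng, jitter=0.1):
    """Hypothetical augmentation: keep the trend coefficients untouched and
    randomly rescale the detail coefficients."""
    approx, details = haar_dwt(x, levels)
    scaled = [d * (1.0 + jitter * rng.standard_normal(d.shape)) for d in details]
    return haar_idwt(approx, scaled)

def fusion_variant(x_a, x_b, levels, alpha=0.5):
    """Hypothetical augmentation: keep the trend of x_a and blend the detail
    bands of x_a and x_b with weight alpha."""
    approx_a, det_a = haar_dwt(x_a, levels)
    _, det_b = haar_dwt(x_b, levels)
    fused = [alpha * da + (1.0 - alpha) * db for da, db in zip(det_a, det_b)]
    return haar_idwt(approx_a, fused)

def latent_consistency_loss(z_new, z_old):
    """Assumed form of the alignment term: mean squared distance between the
    updated model's latents and the frozen old model's latents."""
    diff = np.asarray(z_new, dtype=float) - np.asarray(z_old, dtype=float)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 64)
sample_a = np.sin(t) + 0.1 * rng.standard_normal(64)  # stand-in for a model-generated sample
sample_b = np.cos(t)                                  # second stand-in sample
replay = [
    trend_preserving_variant(sample_a, levels=3, rng=rng),
    fusion_variant(sample_a, sample_b, levels=3),
]
```

In this sketch, both variants share the low-frequency coefficients of `sample_a`, so the replay set varies in high-frequency content while the trend stays fixed. In a training loop, the assumed `latent_consistency_loss` would be added to the forecasting loss on the new task so that the updated encoder stays close to the frozen model's latent space.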
