TRACE: Time SeRies PArameter EffiCient FinE-tuning
Yuze Li, Wei Zhu
TL;DR
TRACE is a parameter-efficient fine-tuning framework for time series foundation models that couples two innovations: reconstructed forecasting heads, which sharply reduce the number of head parameters, and Gated Dynamic Simulation Importance Calculation (Gated DSIC), which debiases LoRA module selection. The method enables dynamic, reversible pruning of LoRA components and applies a Monte Carlo masking scheme to simulate post-pruning contexts, achieving superior performance on long-horizon forecasting, short-horizon forecasting, anomaly detection, and NLP benchmarks with minimal trainable parameters. Extensive experiments across diverse datasets show that TRACE consistently outperforms linear probing, full fine-tuning, and prior PEFT baselines, with statistical validation and favorable deployment costs. The results highlight TRACE’s practical value in resource-constrained settings and its generalizability beyond time series to cross-domain sequential tasks.
Abstract
We propose an efficient fine-tuning method for time series foundation models, termed TRACE: Time Series Parameter Efficient Fine-tuning. While pretrained time series foundation models are gaining popularity, they face the following challenges: (1) Unlike natural language data, time series vary in frequency, number of channels, and historical and prediction lengths. For long-term forecasting tasks in particular, tailored fine-tuning can significantly enhance performance. (2) Existing parameter-efficient tuning methods such as LoRA remain applicable but require adaptation to temporal characteristics. To address these challenges, our TRACE framework introduces two key innovations: (1) Gated DSIC (Gated Dynamic Simulation Importance Calculation), an unbiased LoRA module importance selection mechanism that ensures conditional parameter consistency before and after masking. Experiments demonstrate that Gated DSIC outperforms common fine-tuning approaches. (2) Reconstructed prediction heads for long-term forecasting tasks, which achieve comparable or superior performance to linear probing heads while drastically reducing parameter counts. Extensive experiments on long- and short-term forecasting, anomaly detection, and natural language tasks across diverse datasets, coupled with ablation studies, validate the effectiveness of our method.
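To make the idea of gated LoRA modules and mask-based importance scoring concrete, the following is a minimal NumPy sketch. It is not the Gated DSIC procedure itself, whose details are not given in this section. It assumes a toy setup in which several low-rank updates are attached in parallel to one frozen weight matrix, each behind a 0/1 gate, and it scores a module by the average increase in squared error when that module is switched off while the remaining gates are sampled at random. The names `gated_lora_forward`, `mc_importance`, `keep_prob`, and `n_samples` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gated_lora_forward(x, W, lora, gates, scale=1.0):
    """Frozen linear map W plus gated low-rank updates.

    x: (n, d_in), W: (d_out, d_in), lora: list of (A, B) with
    A: (r, d_in) and B: (d_out, r), gates: one 0/1 value per module.
    """
    y = x @ W.T
    for g, (A, B) in zip(gates, lora):
        y = y + g * scale * (x @ A.T) @ B.T
    return y


def mc_importance(x, target, W, lora, n_samples=64, keep_prob=0.5, seed=0):
    """Toy Monte Carlo importance score (an assumption, not the paper's rule).

    For each random mask over the other modules, compare the loss with
    module i gated off against the loss with module i gated on.
    """
    rng = np.random.default_rng(seed)
    k = len(lora)
    scores = np.zeros(k)
    for _ in range(n_samples):
        mask = (rng.random(k) < keep_prob).astype(float)
        for i in range(k):
            on, off = mask.copy(), mask.copy()
            on[i], off[i] = 1.0, 0.0
            loss_on = np.mean((gated_lora_forward(x, W, lora, on) - target) ** 2)
            loss_off = np.mean((gated_lora_forward(x, W, lora, off) - target) ** 2)
            scores[i] += loss_off - loss_on
    return scores / n_samples


rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 16, 2
W = rng.normal(size=(d_out, d_in))
x = rng.normal(size=(32, d_in))

# Module 0 carries a real update; module 1 has B = 0 and so contributes nothing.
useful = (rng.normal(size=(rank, d_in)), rng.normal(size=(d_out, rank)))
idle = (rng.normal(size=(rank, d_in)), np.zeros((d_out, rank)))
lora = [useful, idle]

# Target produced with every gate open, so removing module 0 must hurt.
target = gated_lora_forward(x, W, lora, np.ones(2))
scores = mc_importance(x, target, W, lora)

# Each LoRA module trains rank * (d_in + d_out) values instead of d_in * d_out.
lora_params = rank * (d_in + d_out)
full_params = d_in * d_out
```

In this toy case the idle module receives a score of zero and could be gated off without changing the output, while the useful module receives a positive score. Because gating only multiplies the update by zero, the pruning decision stays reversible: reopening the gate restores the module unchanged.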
