Pre-trained Forecasting Models: Strong Zero-Shot Feature Extractors for Time Series Classification
Andreas Auer, Daniel Klotz, Sebastian Böck, Sepp Hochreiter
TL;DR
This paper asks whether pre-training on time series forecasting yields representations that generalize to time series classification. It proposes a zero-shot pipeline in which a frozen forecasting encoder extracts embeddings and a simple classifier is trained on top, and it investigates how to aggregate embeddings across layers and sequence positions, along with two embedding augmentations that improve transferability. Forecasting models show strong zero-shot performance, sometimes surpassing models pre-trained specifically for classification, and forecasting quality correlates positively with classification performance. The results support learning to forecast as a viable route to general-purpose time series foundation models.
Abstract
Recent research on time series foundation models has primarily focused on forecasting, leaving it unclear how generalizable their learned representations are. In this study, we examine whether frozen pre-trained forecasting models can provide effective representations for classification. To this end, we compare different representation extraction strategies and introduce two model-agnostic embedding augmentations. Our experiments show that the best forecasting models achieve classification accuracy that matches or even surpasses that of state-of-the-art models pre-trained specifically for classification. Moreover, we observe a positive correlation between forecasting and classification performance. These findings challenge the assumption that task-specific pre-training is necessary, and suggest that learning to forecast may provide a powerful route toward constructing general-purpose time series foundation models.
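The pipeline summarized above (a frozen encoder, embeddings aggregated over layers and sequence positions, and a simple classifier on top) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the encoder is a hypothetical stand-in with fixed random weights rather than a pre-trained forecasting model, mean pooling over time and concatenation across layers are just one possible aggregation choice, the nearest-centroid classifier is an assumed example of a "simple classifier", and the data are synthetic. All function names are invented for this sketch.

```python
import numpy as np


def make_frozen_encoder(n_layers=2, d=8, seed=0):
    # Hypothetical stand-in for a frozen pre-trained forecasting encoder:
    # fixed random weights that are never updated.
    rng = np.random.default_rng(seed)
    dims = [1] + [d] * n_layers
    return [
        (rng.normal(size=(dims[i], dims[i + 1])), rng.normal(size=dims[i + 1]))
        for i in range(n_layers)
    ]


def hidden_states(params, x):
    # x: (batch, seq_len). Returns one (batch, seq_len, d) array per layer.
    h = x[:, :, None]
    states = []
    for w, b in params:
        h = np.tanh(h @ w + b)
        states.append(h)
    return states


def embed(params, x):
    # Sequence aggregation: mean over time steps.
    # Layer aggregation: concatenate the pooled states of all layers.
    pooled = [s.mean(axis=1) for s in hidden_states(params, x)]
    return np.concatenate(pooled, axis=1)


def fit_centroids(z, y):
    # Assumed "simple classifier": one centroid per class in embedding space.
    classes = np.unique(y)
    centroids = np.stack([z[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(classes, centroids, z):
    # Assign each embedding to the class with the nearest centroid.
    dist = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]


def make_toy_data(n, seq_len=64, seed=0):
    # Synthetic two-class data: noisy sine waves that differ in their level.
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    t = np.linspace(0, 4 * np.pi, seq_len, endpoint=False)
    phase = rng.uniform(0, 2 * np.pi, size=(n, 1))
    noise = 0.1 * rng.normal(size=(n, seq_len))
    x = np.sin(t[None, :] + phase) + 1.5 * y[:, None] + noise
    return x, y


def run_demo():
    params = make_frozen_encoder()
    x_train, y_train = make_toy_data(200, seed=1)
    x_test, y_test = make_toy_data(100, seed=2)
    # Only the classifier is fitted; the encoder weights stay untouched.
    classes, centroids = fit_centroids(embed(params, x_train), y_train)
    pred = predict(classes, centroids, embed(params, x_test))
    return float((pred == y_test).mean())


if __name__ == "__main__":
    print("toy test accuracy:", run_demo())
```

In a real setting, `hidden_states` would be replaced by a forward pass through the frozen forecasting model that exposes its intermediate activations; only the classifier on top is fitted to the labeled data.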
