SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting

Ting-Ji Huang, Xu-Yang Chen, Han-Jia Ye

TL;DR

SeqFusion performs zero-shot time-series forecasting without large-scale in-task training data or centralized pre-training data. It builds a model zoo of lightweight PTMs trained on diverse datasets and uses a universal representation extractor to map both the target series and the PTMs into a shared space, so that PTMs can be selected per variate by cosine similarity. Forecasting proceeds sequentially with the selected PTMs and can optionally fuse the top predictions to improve robustness, while preserving privacy and minimizing storage. Across multivariate and univariate benchmarks, SeqFusion achieves competitive accuracy with substantially lower memory requirements than large pre-trained models, which supports its practicality for data-limited and privacy-sensitive applications.

Abstract

Unlike traditional time-series forecasting methods that require extensive in-task data for training, zero-shot forecasting can directly predict future values given a target time series without additional training data. Current zero-shot approaches primarily rely on pre-trained generalized models, with their performance often depending on the variety and relevance of the pre-training data, which can raise privacy concerns. Instead of collecting diverse pre-training data, we introduce SeqFusion in this work, a novel framework that collects and fuses diverse pre-trained models (PTMs) sequentially for zero-shot forecasting. Based on the specific temporal characteristics of the target time series, SeqFusion selects the most suitable PTMs from a batch of pre-collected PTMs, performs sequential predictions, and fuses all the predictions while using minimal data to protect privacy. Each of these PTMs specializes in different temporal patterns and forecasting tasks, allowing SeqFusion to select by measuring distances in a shared representation space of the target time series with each PTM. Experiments demonstrate that SeqFusion achieves competitive accuracy in zero-shot forecasting compared to state-of-the-art methods.
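The selection step described above (matching each variate of the target series to PTMs in a shared representation space) can be illustrated with a minimal sketch. It assumes the representations have already been computed and are given as plain vectors; the names `cosine_similarity`, `select_ptms`, `ptm_reprs`, and the toy PTM labels are hypothetical and not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for a zero vector
    return dot / (norm_u * norm_v)

def select_ptms(variate_reprs, ptm_reprs, top_k=1):
    """For each variate, return the names of the top_k most similar PTMs."""
    selection = []
    for v in variate_reprs:
        ranked = sorted(
            ptm_reprs,
            key=lambda name: cosine_similarity(v, ptm_reprs[name]),
            reverse=True,
        )
        selection.append(ranked[:top_k])
    return selection

# Toy 2-D representations (illustrative values only).
ptm_reprs = {
    "ptm_daily": [1.0, 0.0],
    "ptm_weekly": [0.0, 1.0],
    "ptm_mixed": [1.0, 1.0],
}
variate_reprs = [[0.9, 0.1], [0.2, 0.8]]
chosen = select_ptms(variate_reprs, ptm_reprs, top_k=2)
```

With `top_k=1` this reduces to picking a single PTM per variate; a larger `top_k` yields the candidates for the optional fusion step.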

Paper Structure

This paper contains 16 sections, 10 equations, 5 figures, 8 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of traditional zero-shot forecasting and SeqFusion. Traditional methods rely on sampling data and training, while SeqFusion leverages pre-trained models (PTMs) through matching the target time series to suitable PTMs and fusing their predictions.
  • Figure 2: Overview of SeqFusion. (a) SeqFusion collects diverse PTMs and selects the most suitable ones based on the characteristics of the target time series. Selection relies on representations obtained from a general extractor, which provide vectors for measuring the similarity between the PTMs and each variate in the target time series. (b) SeqFusion performs sequential forecasting with the selected PTMs for each variate and fuses the predictions over all variates to generate the final forecasts. The sequential forecasting process includes normalization, trimming, and the optional aggregated prediction of the most suitable PTMs, followed by concatenation and de-normalization to produce the final forecasts.
  • Figure 3: (a) Architecture of the General Extractor. The encoder-decoder model extracts time-series representations, optimized through self-supervised learning with a series-wise similarity loss and a transferability loss that aligns the representations with the PTMs' transferability. (b) Training Process of the General Extractor. In the Time-Series part, raw time-series data from the pre-training datasets are masked and split to create input pairs. The encoder generates time-series representations while the decoder reconstructs the input. A series-wise similarity loss pulls together representations of series from the same dataset and pushes apart representations from different datasets. In the PTMs part, PTMs are evaluated for transferability using metrics such as 1-MSE, and this information is used to refine the representations so that they reflect both the intrinsic characteristics of the time series and the ability of the PTMs to generalize to related tasks.
  • Figure 4: (a) Visualization of the representations (repr.) of PTMs and target time series using PCA. Triangles represent PTMs and circles represent variates from the target datasets; similar colors indicate different variates of the time series from one dataset. Variates from the same dataset cluster closely, and datasets with similar sampling frequencies (e.g., daily or weekly) are aligned with PTMs trained on datasets sharing these frequencies. (b) Violin plot of PTM performance distributions across datasets, with red “x” markers showing SeqFusion’s combined performance. SeqFusion consistently selects the most suitable PTMs in the model zoo (markers near the bottom), though its performance is limited by the quality and diversity of the model zoo.
  • Figure 5: Performance of SeqFusion with varying numbers of aggregated PTMs. Aggregating predictions from multiple PTMs reduces MSE across datasets.
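
The per-variate pipeline named in the Figure 2 caption (normalization, trimming, optional aggregation, de-normalization, concatenation) can be sketched as follows. This is only an illustration under stated assumptions: z-score normalization, trimming as "keep the last `context_len` points", aggregation as a plain mean over the selected PTMs, and PTMs modeled as Python callables are all choices made here, not details confirmed by the captions. Because normalization is per variate, this sketch de-normalizes each variate before concatenating. The function names and the two toy forecasters are hypothetical.

```python
import statistics

def forecast_variate(history, ptms, context_len):
    """Normalize, trim, predict with each PTM, average, then de-normalize."""
    mean = statistics.fmean(history)
    std = statistics.pstdev(history) or 1.0  # guard against constant series
    normalized = [(x - mean) / std for x in history]
    trimmed = normalized[-context_len:]      # assumed meaning of "trimming"
    preds = [ptm(trimmed) for ptm in ptms]   # one forecast per selected PTM
    fused = [statistics.fmean(step) for step in zip(*preds)]  # mean per step
    return [p * std + mean for p in fused]

def forecast_multivariate(series, selected, context_len):
    """Forecast each variate with its own PTMs and concatenate the results."""
    return [
        forecast_variate(history, ptms, context_len)
        for history, ptms in zip(series, selected)
    ]

# Toy stand-ins for PTMs (hypothetical): each maps a context to a 2-step forecast.
def repeat_last(context, horizon=2):
    return [context[-1]] * horizon

def context_mean(context, horizon=2):
    return [statistics.fmean(context)] * horizon

series = [[1, 2, 3, 4, 5], [7, 7, 7]]
selected = [[repeat_last, context_mean], [repeat_last]]
forecasts = forecast_multivariate(series, selected, context_len=3)
```

Passing a single PTM per variate corresponds to forecasting without aggregation; passing several corresponds to the aggregated setting that Figure 5 varies.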