Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift

Yanru Sun, Zongxia Xie, Emadeldeen Eldele, Dongyue Chen, Qinghua Hu, Min Wu

TL;DR

The paper tackles patch-level distribution shifts in multivariate time-series forecasting by introducing TFPS, a framework that fuses a time–frequency dual-domain encoder with pattern-aware routing. A Pattern Identifier based on subspace clustering partitions patches into latent patterns and a Mixture of Pattern Experts assigns patch-specific predictors, enabling adaptation to evolving temporal patterns. Empirical results across nine real-world datasets show TFPS achieves state-of-the-art or competitive performance, particularly for long-horizon forecasts, and its pattern-aware design yields interpretability via expert specialization. The approach provides a practical, efficient alternative to large foundation models, with strong robustness to concept drift and non-stationarity. Overall, TFPS advances time-series forecasting by explicitly modeling distribution heterogeneity across patches and routing predictions through pattern-specific experts.

Abstract

Time series forecasting, which aims to predict future values based on historical data, has garnered significant attention due to its broad range of applications. However, real-world time series often exhibit complex, non-uniform distributions whose patterns vary across segments, for example by season, operating condition, or semantic meaning, which makes accurate forecasting challenging. Existing approaches typically train a single model to capture all of these diverse patterns; they often struggle with pattern drift between patches, which can lead to poor generalization. To address these challenges, we propose TFPS, a novel architecture that leverages pattern-specific experts for more accurate and adaptable time series forecasting. TFPS employs a dual-domain encoder to capture both time-domain and frequency-domain features, enabling a more comprehensive understanding of temporal dynamics. It then uses subspace clustering to dynamically identify distinct patterns across data patches. Finally, pattern-specific experts model these unique patterns, delivering tailored predictions for each patch. By explicitly learning and adapting to evolving patterns, TFPS achieves significantly improved forecasting accuracy. Extensive experiments on real-world datasets demonstrate that TFPS outperforms state-of-the-art methods, particularly in long-term forecasting, through its dynamic and pattern-aware learning approach. The data and code are available at https://github.com/syrGitHub/TFPS.
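
To make the patch-then-route idea in the abstract concrete, the following is a minimal sketch and not the TFPS implementation. It assumes hand-set pattern centroids in place of learned subspace clustering, a two-number feature (average step and value range) in place of the dual-domain encoder, hard top-1 routing, and two hypothetical experts (`persistence_expert`, `trend_expert`). All function names here are illustrative.

```python
import math


def split_into_patches(series, patch_len):
    """Split a series into non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def patch_features(patch):
    """Toy stand-in for an encoder: (average step, value range) of a patch."""
    slope = (patch[-1] - patch[0]) / (len(patch) - 1)
    return (slope, max(patch) - min(patch))


def soft_assign(feat, centroids, temperature=1.0):
    """Softmax over negative squared distances to each pattern centroid."""
    dists = [sum((f - c) ** 2 for f, c in zip(feat, cen)) for cen in centroids]
    nearest = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - nearest) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def persistence_expert(patch):
    """Hypothetical expert for flat patterns: repeat the last value."""
    return patch[-1]


def trend_expert(patch):
    """Hypothetical expert for trending patterns: extrapolate one step."""
    return patch[-1] + (patch[-1] - patch[0]) / (len(patch) - 1)


def route(patches, centroids, experts):
    """Send each patch to the expert of its most likely pattern (top-1)."""
    outputs = []
    for patch in patches:
        weights = soft_assign(patch_features(patch), centroids)
        k = max(range(len(weights)), key=weights.__getitem__)
        outputs.append((k, experts[k](patch)))
    return outputs


# Assumed centroids: pattern 0 is "flat", pattern 1 is "upward trend".
centroids = [(0.0, 0.0), (1.0, 3.0)]
experts = [persistence_expert, trend_expert]

series = [5, 5, 5, 5, 2, 4, 6, 8]
patches = split_into_patches(series, patch_len=4)
routed = route(patches, centroids, experts)
print(routed)
```

In this toy run the flat patch goes to the persistence expert and the rising patch goes to the trend expert. In the paper, both the patterns and the experts are learned rather than fixed by hand.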

Paper Structure

This paper contains 43 sections, 18 equations, 11 figures, 16 tables, 1 algorithm.

Figures (11)

  • Figure 1: Illustration of distribution shifts between time series patches on the ETTh1 dataset, quantified by Wasserstein distance. The combined time- and frequency-domain views reveal richer and more complementary shift patterns arising from temporal non-stationarity.
  • Figure 2: The structure of our proposed TFPS. The input time series is divided into patches, and positional embeddings are added. These embeddings are processed through two branches: a time-domain branch and a frequency-domain branch. Each branch consists of three key components: (1) an encoder to capture patch-wise features, (2) a clustering mechanism to identify patches with similar patterns, and (3) a mixture of pattern experts block to model the patterns of each cluster. Finally, the outputs from both branches are combined for the final prediction.
  • Figure 3: Illustration of the proposed Pattern Identifier and Mixture of Pattern Experts. The embedded representation $\mathbf{z}$ from DDE combines with subspace $\mathbf{D}$ to construct the subspace affinity vector, which yields the normalized subspace affinity $S$. Subsequently, the refined subspace affinity $\hat{S}$ is computed from $S$ to provide self-supervised information. Then, we assign the corresponding patch-wise experts to the embedded representation $\mathbf{z}$ according to $S$ for modeling.
  • Figure 4: Visualizations of DLinear and TFPS on the ETTh1 dataset when $H = 192$.
  • Figure 5: Interpretable patterns via PI. Expert-0 specializes in downward trends, while Expert-4 focuses on parabolic trends.
  • ...and 6 more figures
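
The Figure 1 caption describes quantifying shift between patches with the Wasserstein distance in both the time and frequency domains. The sketch below shows one way such a measurement could be set up; it is an assumption-laden illustration, not the procedure used for the figure. It assumes non-overlapping patches of equal length, treats each patch (or its amplitude spectrum) as an empirical sample, and uses the fact that for two equal-size samples the 1-Wasserstein distance is the mean absolute difference of their sorted values. The synthetic series and all function names are hypothetical.

```python
import cmath
import math


def split_into_patches(series, patch_len):
    """Split a series into non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size empirical samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def amplitude_spectrum(patch):
    """Magnitudes of the one-sided DFT of a patch (naive O(n^2) transform)."""
    n = len(patch)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(patch)))
        for k in range(n // 2 + 1)
    ]


def pairwise_shift(patches, transform=None):
    """Matrix of Wasserstein distances between all pairs of patches."""
    feats = [transform(p) if transform else p for p in patches]
    return [[wasserstein_1d(u, v) for v in feats] for u in feats]


# Synthetic series: the level and the dominant period both change halfway.
first_half = [math.sin(2 * math.pi * t / 8) for t in range(32)]
second_half = [3.0 + math.sin(2 * math.pi * t / 4) for t in range(32)]
patches = split_into_patches(first_half + second_half, patch_len=16)

time_shift = pairwise_shift(patches)                      # raw values
freq_shift = pairwise_shift(patches, amplitude_spectrum)  # spectra

for row in time_shift:
    print([round(d, 2) for d in row])
```

Patches drawn from the same regime give distances near zero, while patches on opposite sides of the change give clearly larger distances in both views.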