Learning Novel Transformer Architecture for Time-series Forecasting

Juyuan Zhang, Wei Zhu, Jiechao Gao

TL;DR

AutoFormer-TS introduces a comprehensive Transformer search space for time-series forecasting and a novel differentiable NAS method, AB-DARTS, to identify high-performing architectures. By exploring diverse self-attention variants, FFN configurations, and encoding operations, the framework learns task-specific architectures with an efficient search. Empirical results across multiple long- and short-term forecasting benchmarks show consistent improvements over existing SOTA baselines, including PatchTST and pretrained time-series models, while maintaining practical training times. The work suggests that combining a rich architectural search space with ablation-guided operation selection yields tangible gains in forecasting accuracy and efficiency for time-series applications.

Abstract

Despite the success of Transformer-based models in time-series prediction (TSP) tasks, the existing Transformer architecture still faces limitations, and the literature lacks a comprehensive exploration of alternative architectures. To address these challenges, we propose AutoFormer-TS, a novel framework that leverages a comprehensive search space for Transformer architectures tailored to TSP tasks. Our framework introduces a differentiable neural architecture search (DNAS) method, AB-DARTS, which improves upon existing DNAS approaches by enhancing the identification of optimal operations within the architecture. AutoFormer-TS systematically explores alternative attention mechanisms, activation functions, and encoding operations, moving beyond the traditional Transformer design. Extensive experiments demonstrate that AutoFormer-TS consistently outperforms state-of-the-art baselines across various TSP benchmarks, achieving superior forecasting accuracy while maintaining reasonable training efficiency.
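
The idea of differentiable search over candidate operations can be illustrated with a toy sketch. The code below is not the authors' AB-DARTS implementation: it shows a generic DARTS-style mixed operation (a softmax-weighted sum of candidates) and contrasts two ways of picking the final operation, by largest architecture weight and by a hypothetical ablation criterion (the operation whose removal raises the loss most). The candidate set, the function names, and the ablation rule itself are assumptions made for illustration only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate activation functions for a single FFN slot.
CANDIDATES = {
    "relu": lambda x: max(0.0, x),
    "gelu": lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
    "identity": lambda x: x,
}

def mixed_op(x, alphas, masked=None):
    """DARTS-style relaxation: softmax-weighted sum of candidate outputs.

    `masked` optionally names one candidate to exclude (ablate); the
    remaining architecture weights are renormalised by the softmax.
    """
    names = [n for n in CANDIDATES if n != masked]
    weights = softmax([alphas[n] for n in names])
    return sum(w * CANDIDATES[n](x) for w, n in zip(weights, names))

def select_by_weight(alphas):
    """Standard DARTS discretisation: keep the largest architecture weight."""
    return max(alphas, key=alphas.get)

def select_by_ablation(alphas, inputs, targets):
    """Assumed ablation rule: keep the op whose removal hurts the loss most."""
    def mse(masked=None):
        errs = [(mixed_op(x, alphas, masked) - t) ** 2
                for x, t in zip(inputs, targets)]
        return sum(errs) / len(errs)
    base = mse()
    increase = {n: mse(masked=n) - base for n in CANDIDATES}
    return max(increase, key=increase.get)

# Toy data whose true mapping is ReLU; identity has a slightly larger weight.
inputs = [-2.0, -1.0, 1.0, 2.0]
targets = [0.0, 0.0, 1.0, 2.0]
alphas = {"relu": 0.0, "gelu": 0.0, "identity": 0.1}

by_weight = select_by_weight(alphas)
by_ablation = select_by_ablation(alphas, inputs, targets)
```

In this toy setup the two criteria can disagree: the architecture weights slightly favour `identity`, while the ablation criterion looks at how much each candidate actually contributes to the loss. A real search would train the weights and the network parameters jointly rather than fixing them by hand as done here.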

Paper Structure

This paper contains 26 sections, 9 equations, 2 figures, 8 tables, 1 algorithm.

Figures (2)

  • Figure 1: The architecture of our AutoFormer-TS framework.
  • Figure 2: The learned architectures on the ETTh1, ETTm1 and M4 tasks.