Optimizing Time Series Forecasting Architectures: A Hierarchical Neural Architecture Search Approach

Difan Deng, Marius Lindauer

TL;DR

The paper tackles the lack of robust, cross-family architecture design for time series forecasting by introducing a hierarchical neural architecture search method (DARTS-TS) that unifies Seq Net and Flat Net modules in a single heterogeneous search space. It uses a differentiable one-shot NAS approach with hierarchical pruning to automatically discover lightweight yet high-performing forecasting architectures for long-horizon tasks. Empirical results on Weather, Traffic, Exchange, ECL, ETTh/ETTm, and PEMS show competitive accuracy at substantially lower compute and memory cost than several strong hand-crafted baselines, and they reveal limitations of zero-cost proxies in heterogeneous spaces. The work advances automated forecasting by enabling efficient, task-adaptive architecture search, and it provides insights into the different roles that operation families play in different forecasting contexts.

Abstract

The rapid development of time series forecasting research has brought many deep learning-based modules to this field. However, despite the growing number of new forecasting architectures, it is still unclear whether we have leveraged the full potential of these existing modules within a properly designed architecture. In this work, we propose a novel hierarchical neural architecture search approach for time series forecasting tasks. With the design of a hierarchical search space, we incorporate many architecture types designed for forecasting tasks and allow for the efficient combination of different forecasting architecture modules. Results on long-term time series forecasting tasks show that our approach can find lightweight, high-performing forecasting architectures across different forecasting tasks.

Paper Structure (35 sections, 3 equations, 20 figures, 15 tables)

Figures (20)

  • Figure 1: Operation-level search space. Nodes $0$ and $1$ are input nodes that receive the network inputs or the outputs of other cells. Node $4$ is the output node. Each colored edge represents an operation. The search space (left) is a fully connected directed acyclic graph. Once the search has finished, we obtain the final architecture (right).
  • Figure 2: Micro-level search space design for Flat and Seq cells.
  • Figure 3: Macro search space.
  • Figure 4: Random search evaluation results on the ECL and Traffic datasets. (Left) MSE loss distributions on the ECL dataset. (Middle) MSE loss distributions on the Traffic dataset. (Right) Loss distributions on the Traffic and ECL datasets from the same architecture configurations.
  • Figure 5: Spearman correlation between different ZC metrics and the evaluation test MSE losses
  • ...and 15 more figures
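
The operation-level search space in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the usual DARTS-style relaxation, in which every edge of the cell holds a softmax-weighted mixture of candidate operations and the searched architecture keeps the highest-weighted operation on each edge. The candidate operations (`identity`, `double`, `negate`), the scalar inputs, and all function names are hypothetical stand-ins rather than the paper's modules, and the sketch does not reproduce the paper's hierarchical pruning.

```python
import math

# Hypothetical candidate operations on a single float; stand-ins for the
# forecasting modules that the paper places on each edge.
OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "negate": lambda x: -x,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cell_edges(num_nodes=5, num_inputs=2):
    """All edges (i, j) with i < j whose target j is not an input node."""
    return [(i, j) for j in range(num_inputs, num_nodes) for i in range(j)]


def mixed_edge(x, alphas):
    """Softmax-weighted sum of every candidate operation applied to x."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, OPS.values()))


def forward(inputs, alpha, num_nodes=5):
    """Evaluate the cell: each non-input node sums its incoming mixed edges."""
    states = list(inputs)
    for j in range(len(inputs), num_nodes):
        states.append(sum(mixed_edge(states[i], alpha[(i, j)]) for i in range(j)))
    return states[-1]  # the last node (node 4 here) is the output node


def discretize(alpha):
    """Keep only the highest-weighted operation on every edge."""
    names = list(OPS)
    return {
        edge: names[max(range(len(a)), key=a.__getitem__)]
        for edge, a in alpha.items()
    }


edges = cell_edges()
# Architecture weights: uniform everywhere except edge (0, 2), which is
# set by hand to favour "double" (illustrative values, not learned).
alpha = {edge: [0.0, 0.0, 0.0] for edge in edges}
alpha[(0, 2)] = [0.1, 2.0, -1.0]

final_architecture = discretize(alpha)
```

In an actual search the entries of `alpha` would be learned by gradient descent jointly with the operation weights. Here they are fixed by hand so that the discretization step can be inspected directly.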