Neural Architecture Search for global multi-step Forecasting of Energy Production Time Series

Georg Velev, Stefan Lessmann

TL;DR

The paper tackles the challenge of accurate, efficient global multi-step forecasting for energy production time series under temporal generalization constraints. It proposes a macro NAS framework driven by an actor-critic reinforcement learning controller and a novel WV-GED reward that promotes temporal generalization and exploration of diverse architectures. An ensemble of lightweight NAS-discovered models, notably MTSMixer-based blocks, achieves superior predictive accuracy and runtime efficiency versus Transformers and pre-trained forecasting models across horizons $48$, $96$, and $192$, with demonstrated transferability to renewable and fossil-fuel data. This work highlights the importance of search-space design and reward shaping for reliable, deployable energy forecasting in real-world grid and market contexts.

Abstract

The dynamic energy sector requires both predictive accuracy and runtime efficiency for short-term forecasting of energy generation under operational constraints, where timely and precise predictions are crucial. Manually configuring complex methods that can generate accurate global multi-step predictions without suffering from a computational bottleneck is time-consuming and carries a high risk of human error. A further intricacy arises from the temporal dynamics present in energy-related data. Additionally, generalization to unseen data is imperative for continuously deploying forecasting techniques over time. To overcome these challenges, we design a neural architecture search (NAS)-based framework for the automated discovery of time series models that strike a balance between computational efficiency, predictive performance, and generalization power for the global, multi-step short-term forecasting of energy production time series. In particular, we introduce a search space consisting only of efficient components that can capture distinctive patterns of energy time series. Furthermore, we formulate a novel objective function that accounts for performance generalization in a temporal context and for the maximal exploration of different regions of our high-dimensional search space. The results obtained on energy production time series show that an ensemble of lightweight architectures discovered with NAS outperforms state-of-the-art techniques, such as Transformers, as well as pre-trained forecasting models, in terms of both efficiency and accuracy.

Paper Structure

This paper contains 16 sections, 12 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Architecture of the agent and information flow within the actor-critic.
  • Figure 2: Example for total energy production time series from Belgium, Germany, and Poland.
  • Figure 3: NAS training details related to the controller networks trained with our novel reward signal and the entropy-based reward, a) the rewards achieved per episode on average, i.e., the sum of the inverse of RMSE errors obtained on both validation subsets, b) the average NAS model probability per episode, c) and d) the average GED and the normalized entropy of the NAS models per episode.
  • Figure 4: Two-dimensional representation of the search space regions visited by both controllers during NAS. The coloring of the scatterplot is related to the rewards, i.e., the inverse of the RMSE scores, achieved by the sampled architectures on both validation subsets.
  • Figure 5: Heatmaps visualizing the Pearson correlation between the true values and the target predictions per time step obtained from the best performing WV-GED-based and entropy-based models. The correlation values are averaged across both time series datasets from the three forecasting horizon settings $\{48, 96, 192\}$.
  • ...and 3 more figures
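
The captions of Figures 3 and 4 describe the plotted reward as the sum of the inverse RMSE scores obtained on both validation subsets. The following is a minimal Python sketch of that quantity only. The function names, the `eps` guard against division by zero, and the toy numbers are illustrative assumptions and are not taken from the paper. Any further terms of the WV-GED reward are not described in this section and are therefore omitted.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def inverse_rmse_reward(validation_subsets, eps=1e-8):
    """Sum of 1 / RMSE over the validation subsets.

    validation_subsets: iterable of (y_true, y_pred) pairs, one pair per subset.
    eps: small constant (an assumption of this sketch) that avoids division
         by zero when a subset is predicted perfectly.
    """
    return sum(1.0 / (rmse(y_true, y_pred) + eps)
               for y_true, y_pred in validation_subsets)


# Toy example with two hypothetical validation subsets.
subset_a = ([0.0, 0.0], [2.0, 2.0])  # RMSE = 2
subset_b = ([1.0, 1.0], [2.0, 0.0])  # RMSE = 1
reward = inverse_rmse_reward([subset_a, subset_b])
```

With this form, a lower RMSE on either subset raises the reward, so an architecture scores highest when it is accurate on both validation subsets.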