Simple Feedforward Neural Networks are Almost All You Need for Time Series Forecasting

Fan-Keng Sun, Yu-Cheng Wu, Duane S. Boning

TL;DR

This work argues that simple feedforward neural networks (SFNNs), when equipped with a few carefully chosen enhancements, can rival state-of-the-art time series forecasting models such as Transformers. The proposed SFNN architecture applies a univariate core shared across all series, with optional modules for input mean centering, series-wise non-linear mapping, and layer normalization to boost performance. Through extensive experiments and an ablation study, the authors show that SFNNs achieve state-of-the-art results on many datasets, support the claim of robustness with gains from longer look-back windows, and identify limitations on certain datasets such as Traffic. They also critique current benchmarking practices and propose a fairer evaluation protocol, positioning SFNNs as a strong baseline against which future work should be compared.

Abstract

Time series data are everywhere, from finance to healthcare, and each domain brings its own unique complexities and structures. While advanced models like Transformers and graph neural networks (GNNs) have gained popularity in time series forecasting, largely due to their success in tasks like language modeling, their added complexity is not always necessary. In our work, we show that simple feedforward neural networks (SFNNs) can achieve performance on par with, or even exceeding, these state-of-the-art models, while being simpler, smaller, faster, and more robust. Our analysis indicates that, in many cases, univariate SFNNs are sufficient, implying that modeling interactions between multiple series may offer only marginal benefits. Even when inter-series relationships are strong, a basic multivariate SFNN still delivers competitive results. We also examine some key design choices and offer guidelines on making informed decisions. Additionally, we critique existing benchmarking practices and propose an improved evaluation protocol. Although SFNNs may not be optimal for every situation (hence the "almost" in our title), they serve as a strong baseline that future time series forecasting methods should always be compared against.
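
To make the univariate versus multivariate distinction concrete, the following is a minimal NumPy sketch of a forward pass built from the components named in this summary and in the Figure 1 caption: a shared core of linear mappings with ReLU activations and residual connections, plus optional mean centering, series-wise mapping with SELU, and layer normalization. It is not the authors' implementation. The function names (`init_params`, `sfnn_forward`), the layer sizes, the initialization, and the order in which the optional modules are applied are all assumptions made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def selu(z):
    # Standard SELU constants; exp is clipped to avoid overflow on large inputs.
    alpha, scale = 1.6732632423543772, 1.0507009873554805
    return scale * np.where(z > 0, z, alpha * (np.exp(np.minimum(z, 0.0)) - 1.0))

def layer_norm(z, eps=1e-5):
    # Normalize each series independently over the time axis (no learned gain/bias).
    mu = z.mean(axis=-1, keepdims=True)
    var = z.var(axis=-1, keepdims=True)
    return (z - mu) / np.sqrt(var + eps)

def init_params(rng, n_series, lookback, horizon, n_blocks=2):
    # Hypothetical parameter layout; sizes and scales are illustrative only.
    s = 1.0 / np.sqrt(lookback)
    return {
        "blocks": [(rng.normal(0.0, s, (lookback, lookback)), np.zeros(lookback))
                   for _ in range(n_blocks)],
        "series_map": rng.normal(0.0, 1.0 / np.sqrt(n_series), (n_series, n_series)),
        "head": (rng.normal(0.0, s, (lookback, horizon)), np.zeros(horizon)),
    }

def sfnn_forward(x, params, center=True, series_wise=False, use_layer_norm=False):
    """Map x of shape (n_series, lookback) to forecasts of shape (n_series, horizon)."""
    # Optional input mean centering: remove each series' mean, restore it at the end.
    mean = x.mean(axis=-1, keepdims=True) if center else 0.0
    h = x - mean
    if series_wise:
        # Optional series-wise mapping: the only step that mixes different series.
        h = h + selu(params["series_map"] @ h)
    for W, b in params["blocks"]:
        z = layer_norm(h) if use_layer_norm else h
        # Core block: linear mapping + ReLU with a residual connection,
        # using the same weights for every series (the shared univariate core).
        h = h + relu(z @ W + b)
    W_out, b_out = params["head"]
    return h @ W_out + b_out + mean

rng = np.random.default_rng(0)
params = init_params(rng, n_series=4, lookback=32, horizon=8)
x = rng.normal(size=(4, 32))
y = sfnn_forward(x, params)
print(y.shape)
```

In this sketch, leaving `series_wise=False` gives a purely univariate model: the forecast for one series never depends on the values of another. Setting `series_wise=True` is one simple way to obtain a multivariate variant.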

Paper Structure

This paper contains 22 sections, 3 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: The SFNN architecture. The core block consists of linear mappings with ReLU activations and residual connections. Optional modules are dotted and include input mean centering, series-wise mapping with SELU activations, and layer normalization.
  • Figure 2: The relative run times, MSEs, and parameter counts of DUET [duet] and iTransformer [liu2023itransformer] compared to SFNNs on various datasets. For each dataset, horizons 96 and 720 are compared.
  • Figure 3: The percentage improvement by applying mean centering across all datasets and horizons, along with a linear regression line and its $95\%$ confidence interval.
  • Figure 4: Percentage improvement from incorporating series-wise mapping, along with a linear regression line and its $95\%$ confidence interval. The left panel presents results for datasets with a large number of series, while the right panel shows those for datasets with a small number of series.
  • Figure 5: Johansen cointegration test results for all datasets, with the right panel offering a zoomed-in view. The horizontal axis represents the number of lags used in the test, while the vertical axis shows the test statistic for a cointegration rank of $r = N-1$. The dotted horizontal lines indicate the critical values for rejecting, at the $95\%$ confidence level, the null hypothesis of cointegration rank $r = N-1$, which implies that all $N$ series are cointegrated.
  • ...and 2 more figures