Are Data Embeddings effective in time series forecasting?
Reza Nematirad, Anil Pahwa, Balasubramaniam Natarajan
TL;DR
The paper asks whether data embedding layers are necessary in time series forecasting. It runs large-scale ablations across fifteen high-performing models and four benchmark datasets, systematically removing embedding components and measuring the effect on MSE and MAE as well as on training time and memory. Removing embeddings typically improves both accuracy and efficiency, with larger gains at longer horizons, and these improvements can exceed the gains reported between competing models. The findings challenge the assumption that embedding layers are essential and motivate re-evaluating architectural components and validating the result across more tasks and normalization strategies. Code is publicly available for reproduction and extension.
Abstract
Time series forecasting plays a crucial role in many real-world applications, and numerous complex forecasting models have been proposed in recent years. Despite their architectural innovations, most state-of-the-art models report only marginal improvements, typically just a few thousandths in standard error metrics. These models often incorporate complex data embedding layers that transform raw inputs into higher-dimensional representations in order to enhance accuracy. But are data embedding techniques actually effective in time series forecasting? Through extensive ablation studies across fifteen state-of-the-art models and four benchmark datasets, we find that removing data embedding layers from many state-of-the-art models does not degrade forecasting performance. In many cases, it improves both accuracy and computational efficiency. The gains from removing embedding layers often exceed the performance differences typically reported between competing models. Code is available at: https://github.com/neuripsdataembedidng/DataEmbedding
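
For readers unfamiliar with the error metrics mentioned above, the following is a minimal sketch of how MSE and MAE are commonly computed and compared between two model variants. It is not taken from the authors' code: the function names, the variable names, and all numeric values are hypothetical and chosen only for illustration, so they say nothing about the paper's results.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ground truth and forecasts (illustrative values only,
# not results from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
forecast_with_embedding = [1.5, 2.5, 2.5, 3.5]
forecast_without_embedding = [1.0, 2.0, 3.0, 3.0]

for name, forecast in [
    ("with embedding", forecast_with_embedding),
    ("without embedding", forecast_without_embedding),
]:
    print(f"{name}: MSE={mse(y_true, forecast):.4f}, MAE={mae(y_true, forecast):.4f}")
```

An ablation of the kind described in the abstract would compute these metrics for the same model with and without its embedding layer, on the same test split, and compare the two.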
