Inter-Series Transformer: Attending to Products in Time Series Forecasting
Rares Cristian, Pavithra Harsha, Clemente Ocejo, Georgia Perakis, Brian Quanz, Ioannis Spantidakis, Hamza Zerhouni
TL;DR
The paper tackles demand forecasting in supply-chain settings, where sparsity and cross-series effects hinder both traditional models and standard Transformers. It proposes the Inter-Series Transformer, which first applies a cross-series attention layer to inform the target time series and then passes the result through a shared, multi-task per-series Transformer, capturing both cross-series interactions and per-series temporal patterns. Empirical results on a private medical-device dataset and two Walmart retail datasets show that the approach often outperforms baselines and is competitive with state-of-the-art Transformer forecasters. Ablations highlight the value of high-dimensional feature projections and of omitting positional encoding in favor of explicit date features. The work advances practical demand forecasting by addressing sparsity, overfitting, and cross-series effects, with interpretable attention patterns and cross-validation analyses supporting its applicability to real-world supply chains.
Abstract
Time series forecasting is an important task in many fields ranging from supply chain management to weather forecasting. Recently, Transformer neural network architectures have shown promising results in forecasting on common time series benchmark datasets. However, application to supply chain demand forecasting, which can have challenging characteristics such as sparsity and cross-series effects, has been limited. In this work, we explore the application of Transformer-based models to supply chain demand forecasting. In particular, we develop a new Transformer-based forecasting approach using a shared, multi-task per-time series network with an initial component applying attention across time series, to capture interactions and help address sparsity. We provide a case study applying our approach to successfully improve demand prediction for a medical device manufacturing company. To further validate our approach, we also apply it to public demand forecasting datasets and demonstrate competitive or superior performance compared to a variety of baseline and state-of-the-art forecast methods across the private and public datasets.
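To make the "attention across time series" idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that, at each time step, the target series' embedding acts as the query and the other series' embeddings at the same step act as both keys and values, and that the attended context is added back to the target as a residual. The function name `cross_series_attention`, the identity (unlearned) projections, and the same-time-step alignment are all illustrative assumptions; the paper's layer uses learned parameters and may differ in these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_series_attention(target, others):
    """Hypothetical cross-series attention step (illustrative only).

    target: list of T embedding vectors (each of length d) for the target series.
    others: list of S series, each a list of T embedding vectors of length d.

    Returns (enriched, weights): the target embeddings with the attended
    context added as a residual, and the per-step attention weights over
    the S other series.
    """
    d = len(target[0])
    scale = math.sqrt(d)  # scaled dot-product attention
    enriched, weights_per_step = [], []
    for t, query in enumerate(target):
        # Assumption: keys and values are the other series at the same step t.
        keys = [series[t] for series in others]
        weights = softmax([dot(query, k) / scale for k in keys])
        context = [sum(w * k[j] for w, k in zip(weights, keys)) for j in range(d)]
        enriched.append([q + c for q, c in zip(query, context)])
        weights_per_step.append(weights)
    return enriched, weights_per_step


# Toy usage: one time step, two candidate series in a 2-dimensional embedding.
target = [[1.0, 0.0]]
others = [[[1.0, 0.0]], [[0.0, 1.0]]]
enriched, weights = cross_series_attention(target, others)
```

In this toy example the first of the other series is aligned with the target, so it receives the larger attention weight. In the architecture described above, the enriched target sequence would then be passed to the shared per-series Transformer, with the same network weights reused for every series; that second stage is omitted here.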
