MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention

Carson Eisenach; Yagna Patel; Dhruv Madeka

TL;DR

This work addresses the challenge of accurate multi-horizon probabilistic forecasting by introducing MQTransformer, which adds context-dependent horizon-specific decoder–encoder attention, learned position encodings from event indicators, and a decoder self-attention mechanism that leverages forecast feedback. The approach directly outputs quantiles, scales to large datasets via forking sequences, and demonstrates substantial improvements in both forecast accuracy and volatility reduction across large-scale and public benchmarks. Key findings include up to 33% gains in seasonal peak accuracy, large reductions in excess forecast volatility, and a 38% improvement over prior state-of-the-art on a retail forecasting dataset. The methods offer practical benefits for high-volume forecasting tasks in supply chain and related domains, with notable gains in throughput and prediction reliability.

Abstract

Recent advances in neural forecasting have produced major improvements in accuracy for probabilistic demand prediction. In this work, we propose novel improvements to the current state of the art by incorporating changes inspired by recent advances in Transformer architectures for Natural Language Processing. We develop a novel decoder-encoder attention for context alignment, improving forecasting accuracy by allowing the network to study its own history based on the context for which it is producing a forecast. We also present a novel positional encoding that allows the neural network to learn context-dependent seasonality functions as well as arbitrary holiday distances. Finally, we show that the current state-of-the-art MQ-Forecaster (Wen et al., 2017) models display excess variability by failing to leverage previous errors in the forecast to improve accuracy. We propose a novel decoder self-attention scheme for forecasting that produces significant reductions in the excess variation of the forecast.
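Models in the MQ-Forecaster family emit quantiles directly, and are typically trained by minimizing the quantile (pinball) loss summed over horizons and quantile levels. The following is a minimal sketch of that loss in plain Python, not the authors' implementation; the function names `quantile_loss` and `mean_quantile_loss` and the dictionary layout of the forecasts are assumptions made for illustration.

```python
def quantile_loss(y_true, y_pred, q):
    """Pinball loss for one observation at quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    diff = y_true - y_pred
    # Under-forecasts (diff > 0) are weighted by q,
    # over-forecasts (diff < 0) by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def mean_quantile_loss(y_true, y_pred_by_quantile):
    """Average pinball loss over all horizons and quantile levels.

    y_true: list of actual values, one per forecast horizon.
    y_pred_by_quantile: dict mapping a quantile level q to a list of
        forecasts, one per horizon (hypothetical layout for this sketch).
    """
    total, count = 0.0, 0
    for q, preds in y_pred_by_quantile.items():
        for actual, pred in zip(y_true, preds):
            total += quantile_loss(actual, pred, q)
            count += 1
    return total / count
```

For a high quantile such as q = 0.9, an under-forecast costs nine times as much as an over-forecast of the same size, which pushes the P90 forecast above most realized values.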
