MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting
Linfeng Du, Ji Xin, Alex Labach, Saba Zuberi, Maksims Volkovs, Rahul G. Krishnan
TL;DR
The paper tackles the challenge of long-horizon time series forecasting where real-world data exhibit multiple coexisting periodicities. It introduces MultiResFormer, a Transformer with adaptive multi-resolution modeling that detects salient periodicities within each block and forms parallel patching branches at lengths aligned to these periods, enabling simultaneous interperiod and intraperiod modeling via a shared Transformer encoder. Key innovations include a periodicity-driven patching mechanism, an interpolation-based parameter-sharing strategy across resolutions, a resolution embedding scheme, and RevIN-based normalization to handle distribution shifts. Empirical results across long-term and short-term benchmarks show that MultiResFormer achieves state-of-the-art performance with substantially fewer parameters and competitive training efficiency, outperforming strong baselines like PatchTST and TimesNet on multiple datasets and horizons. The work contributes a practical, data-driven approach to adaptively fuse information across multiple time scales, with clear implications for applications in power, weather, transportation, and epidemiology where complex periodic patterns are prevalent.
Abstract
Transformer-based models have recently pushed the boundaries of time series forecasting. Existing methods typically encode time series data into $\textit{patches}$ using one or a fixed set of patch lengths. This, however, can limit their ability to capture the variety of intricate temporal dependencies present in real-world multi-periodic time series. In this paper, we propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths. Concretely, at the beginning of each layer, time series data is encoded into several parallel branches, each using a detected periodicity, before going through the Transformer encoder block. We conduct extensive evaluations on long- and short-term forecasting datasets, comparing MultiResFormer with state-of-the-art baselines. MultiResFormer outperforms patch-based Transformer baselines on long-term forecasting tasks and also consistently outperforms CNN baselines by a large margin, while using far fewer parameters than these baselines.
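To make the idea of periodicity-aligned patching concrete, the following is a minimal sketch and not the authors' implementation. It assumes that salient periods are read off the strongest bins of a discrete Fourier transform (the text above does not state the detection method), that patches are non-overlapping with a zero-padded tail, and that patches from different branches are linearly interpolated to one common length so that a single encoder could process all of them. The names `detect_periods`, `patch`, `resample`, and the target length of 16 are hypothetical.

```python
import cmath
import math


def dft_amplitudes(x):
    """Amplitude of each non-negative frequency bin of a real series (naive O(T^2) DFT)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n)))
        for f in range(n // 2 + 1)
    ]


def detect_periods(x, k=2):
    """Assumed detection rule: periods of the k strongest non-zero frequency bins."""
    n = len(x)
    amps = dft_amplitudes(x)
    # Skip bin 0 (the mean) and rank the remaining bins by amplitude.
    top = sorted(range(1, len(amps)), key=lambda f: amps[f], reverse=True)[:k]
    return [round(n / f) for f in top]


def patch(x, period):
    """Split a list into non-overlapping patches of length `period`, zero-padding the tail."""
    padded = x + [0.0] * (-len(x) % period)
    return [padded[i:i + period] for i in range(0, len(padded), period)]


def resample(p, size):
    """Linearly interpolate one patch to a common length (size must be at least 2)."""
    if len(p) == 1:
        return [p[0]] * size
    out = []
    for j in range(size):
        pos = j * (len(p) - 1) / (size - 1)
        lo = int(pos)
        hi = min(lo + 1, len(p) - 1)
        out.append(p[lo] + (pos - lo) * (p[hi] - p[lo]))
    return out


# Toy series of length 96 with two coexisting periods, 24 and 8.
series = [
    math.sin(2 * math.pi * t / 24) + 0.5 * math.sin(2 * math.pi * t / 8)
    for t in range(96)
]
periods = detect_periods(series, k=2)

# One branch per detected period; every patch is resampled to length 16.
branches = {p: [resample(q, 16) for q in patch(series, p)] for p in periods}
for p, patches in branches.items():
    print(p, len(patches), len(patches[0]))
```

In this toy setup each branch holds a different number of patches, but all patches share one length, which is what would allow the same encoder weights to be reused across resolutions.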
