DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting
Ruixin Ding, Yuqi Chen, Yu-Ting Lan, Wei Zhang
TL;DR
DRFormer tackles long-term time-series forecasting without relying on a fixed patch length: a dynamic tokenizer trained with sparse learning captures diverse receptive fields. It builds multi-scale representations via hierarchical pooling and a group-aware Transformer with group-aware rotary position encoding (gRoPE), then fuses the scales with deconvolution to predict future sequences. Empirical results on multiple real-world datasets show state-of-the-art performance, with consistent gains over both Transformer-based and non-Transformer baselines, and ablations confirm the contributions of dynamic modeling, multi-scale modeling, and the position encoding. The work offers a transferable framework for patch-based time-series modeling that reduces the need for expert patch-length selection and captures cross-scale dependencies.
Abstract
Long-term time series forecasting (LTSF) has been widely applied in finance, traffic prediction, and other domains. Recently, patch-based transformers have emerged as a promising approach, segmenting data into sub-level patches that serve as input tokens. However, existing methods mostly rely on predetermined patch lengths, which requires expert knowledge and makes it hard to capture diverse characteristics across scales. Moreover, time series data exhibit diverse variations and fluctuations across different temporal scales, which traditional approaches struggle to model effectively. In this paper, we propose a dynamic tokenizer with a dynamic sparse learning algorithm to capture diverse receptive fields and sparse patterns of time series data. To build hierarchical receptive fields, we develop a multi-scale Transformer model that, coupled with multi-scale sequence extraction, captures multi-resolution features. Additionally, we introduce a group-aware rotary position encoding technique to enhance intra- and inter-group position awareness among representations across different temporal scales. Our proposed model, named DRFormer, is evaluated on various real-world datasets, and experimental results demonstrate its superiority over existing methods. Our code is available at: https://github.com/ruixindingECNU/DRFormer.
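
To make the patch-based setting concrete, the following is a minimal sketch of the fixed-length baseline that the abstract contrasts against, followed by a simple way to derive coarser token sequences. It is not the DRFormer implementation: the function names `patchify` and `multi_scale`, the patch length, the stride, and the choice of pairwise average pooling are all assumptions made for illustration, and the dynamic tokenizer, sparse learning, and position encoding are not shown.

```python
import numpy as np

def patchify(series, patch_len, stride):
    """Split a 1-D series into overlapping patches of one fixed length.

    Hypothetical helper: patch_len and stride are the hand-picked
    hyperparameters that fixed-patch methods require.
    """
    series = np.asarray(series, dtype=float)
    n_patches = (len(series) - patch_len) // stride + 1
    # Row i holds the indices of patch i: stride*i, ..., stride*i + patch_len - 1.
    idx = np.arange(patch_len)[None, :] + stride * np.arange(n_patches)[:, None]
    return series[idx]  # shape: (n_patches, patch_len)

def multi_scale(tokens, n_scales):
    """Build coarser token sequences by averaging adjacent token pairs.

    Assumption: pairwise average pooling stands in for whatever pooling
    a real model uses; a trailing odd token is dropped at each level.
    """
    scales = [tokens]
    for _ in range(n_scales - 1):
        current = scales[-1]
        usable = (current.shape[0] // 2) * 2
        if usable < 2:
            break
        pooled = current[:usable].reshape(-1, 2, current.shape[1]).mean(axis=1)
        scales.append(pooled)
    return scales

# Toy look-back window of 96 steps.
x = np.arange(96, dtype=float)
patches = patchify(x, patch_len=16, stride=8)
scales = multi_scale(patches, n_scales=3)
shapes = [s.shape for s in scales]
```

Each level halves the number of tokens, so a token at a coarser level summarizes a wider stretch of the input. In this sketch the receptive field is fully determined by `patch_len`, `stride`, and the pooling depth, which is the kind of manual choice the abstract argues against.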
