Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System
Mingjie Li, Rui Liu, Guangsi Shi, Mingfei Han, Changling Li, Lina Yao, Xiaojun Chang, Ling Chen
TL;DR
This work tackles data redundancy in rolling-window long-term time-series forecasting by introducing CLMFormer, a Transformer-based framework that combines curriculum-learning-driven noise with a memory-driven decoder to diversify training samples. The method uses a progressive dropout schedule and a seasonal memory module, including Memory-driven Conditional Layer Normalization and a Seasonal Memory Matrix, to enhance pattern recognition and capture seasonality in highly similar data. Extensive experiments on six real-world benchmarks show improvements of up to 30% over strong Transformer baselines, with pronounced gains for longer prediction horizons and when integrated with state-of-the-art models such as FEDformer. The approach is shown to be broadly compatible with existing LTSF architectures and offers a practical path toward more robust long-range forecasts in domains with limited diverse training data.
Abstract
Long-term time-series forecasting (LTSF) is fundamental to various real-world applications, where Transformer-based models have become the dominant framework due to their ability to capture long-range dependencies. However, these models often overfit because of data redundancy in rolling forecasting settings, which limits their generalization ability. The problem is particularly evident for longer sequences, where adjacent samples are highly similar. In this work, we introduce CLMFormer, a novel framework that mitigates redundancy through curriculum learning and a memory-driven decoder. Specifically, we progressively introduce Bernoulli noise to the training samples, which effectively breaks the high similarity between adjacent data points and supplies more diverse and representative training data. To further enhance forecasting accuracy, we introduce a memory-driven decoder that captures seasonal tendencies and dependencies in the time-series data and leverages temporal relationships to facilitate the forecasting process. The curriculum-driven noise in turn aids this decoder, enhancing its ability to model seasonal tendencies and dependencies. Extensive experiments on six real-world LTSF benchmarks show that CLMFormer consistently improves Transformer-based models by up to 30%, demonstrating its effectiveness in long-horizon forecasting.
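To make the redundancy problem and the curriculum-driven Bernoulli noise concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names, the linear schedule, the `max_drop` value of 0.3, and the choice to zero out masked entries of the raw input window are all assumptions made for illustration; the abstract states only that Bernoulli noise is introduced progressively.

```python
import numpy as np


def rolling_windows(series, length):
    """Return all stride-1 rolling windows of `length` over a 1-D series."""
    return np.lib.stride_tricks.sliding_window_view(series, length)


def keep_probability(epoch, total_epochs, max_drop=0.3):
    """Hypothetical linear curriculum: the drop rate grows from 0 to max_drop.

    The schedule shape and max_drop are assumptions, not taken from the paper.
    """
    progress = min(max(epoch / max(total_epochs - 1, 1), 0.0), 1.0)
    return 1.0 - max_drop * progress


def bernoulli_mask(window, keep_prob, rng):
    """Zero each entry independently with probability 1 - keep_prob."""
    mask = rng.random(window.shape) < keep_prob
    return window * mask


rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 20.0, 200))
windows = rolling_windows(series, 96)

# Redundancy: with stride 1, consecutive windows share all but one value.
overlap = np.array_equal(windows[0][1:], windows[1][:-1])

# Curriculum: early epochs see clean windows, later epochs see noisier ones.
total_epochs = 10
for epoch in range(total_epochs):
    p_keep = keep_probability(epoch, total_epochs)
    noisy_batch = bernoulli_mask(windows, p_keep, rng)
```

Under these assumptions, two adjacent training windows stop being near-copies of each other once independent masks are applied, and the gradual schedule lets the model see clean data first before the perturbation strength increases.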
