Boosting MLPs with a Coarsening Strategy for Long-Term Time Series Forecasting
Nannan Bian, Minhong Zhu, Li Chen, Weiran Cai
TL;DR
CP-Net introduces a two-stage coarsening scheme to boost MLP-based long-term time series forecasting. By forming information granules through a Token Projection Block and a Contextual Sampling Block, and then merging multiple temporal scales, CP-Net preserves crucial temporal correlations while filtering noise, all with linear computational complexity. Empirical results on seven datasets show CP-Net achieving state-of-the-art or competitive performance, including notable gains over Transformer- and CNN-based baselines and faster training/inference than attention-based models. The approach demonstrates that convolutional boosting of MLPs with multi-scale coarsening can effectively model both local and global temporal patterns in multivariate time series.
Abstract
Deep learning methods have shown their strengths in long-term time series forecasting. However, they often struggle to balance expressive power against computational efficiency. Multi-layer perceptrons (MLPs) offer a compromise, yet they suffer from two critical problems caused by their intrinsic point-wise mapping: deficient contextual dependencies and an inadequate information bottleneck. Here, we propose the Coarsened Perceptron Network (CP-Net), which features a coarsening strategy that alleviates these problems of prototype MLPs by forming information granules in place of solitary temporal points. CP-Net primarily uses a two-stage framework for extracting semantic and contextual patterns, which preserves correlations over larger timespans and filters out volatile noise. This is further enhanced by a multi-scale setting, in which patterns of diverse granularities are fused into a comprehensive prediction. Based purely on structurally simple convolutions, CP-Net maintains linear computational complexity and low runtime while demonstrating an improvement of 4.1% over the SOTA method on seven forecasting benchmarks.
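
The coarsening idea can be illustrated with a minimal NumPy sketch. This is not the CP-Net architecture: the abstract does not specify the internals of the two stages, so the fixed averaging kernels, the non-overlapping stride, the scales (2, 4, 8), and the repeat-and-average fusion below are all assumptions chosen for illustration, and the function names are hypothetical. The sketch only shows how a strided convolution replaces solitary points with granules, how several granularities can be combined, and why the cost per scale stays linear in the input length.

```python
import numpy as np

def coarsen(x, kernel, stride):
    """Strided 1-D 'valid' convolution: each window of len(kernel) points
    becomes one granule. With stride == len(kernel), the windows do not
    overlap, so the total work is proportional to len(x)."""
    k = len(kernel)
    n = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride : i * stride + k], kernel)
                     for i in range(n)])

def multi_scale_granules(x, scales=(2, 4, 8)):
    """Coarsen x at several granularities.
    Assumption: fixed averaging kernels stand in for learned filters."""
    return {s: coarsen(x, np.full(s, 1.0 / s), stride=s) for s in scales}

def fuse(granules, length):
    """Hypothetical fusion: upsample each scale by repetition, then average."""
    upsampled = [np.repeat(g, s)[:length] for s, g in granules.items()]
    return np.mean(upsampled, axis=0)

# Toy input: a daily-period sine wave with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(96)
clean = np.sin(2 * np.pi * t / 24)
noisy = clean + 0.3 * rng.standard_normal(96)

granules = multi_scale_granules(noisy)
fused = fuse(granules, len(noisy))

for s, g in granules.items():
    print(f"scale {s}: {g.shape[0]} granules")
print("fused length:", fused.shape[0])
```

In a learned model the averaging kernels would be replaced by trainable convolution weights and the fusion by a trainable projection; the sketch keeps both fixed so that the effect of granule size on the sequence length is easy to inspect.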
