AWEMixer: Adaptive Wavelet-Enhanced Mixer Network for Long-Term Time Series Forecasting

Qianyang Li, Xingjun Zhang, Peng Tao, Shaoxun Wang, Yancheng Pan, Jia Wei

TL;DR

AWEMixer tackles long-term forecasting for non-stationary, multi-scale IoT time series by integrating an adaptive wavelet-based frequency stream with a Mixer-style temporal backbone. The model introduces a Frequency Router to dynamically weight wavelet subbands and a Coherent Gated Fusion block to selectively fuse time–frequency features via cross-attention and gating, all within a dual-stream, cross-scale architecture that maintains linear complexity. Empirical results on seven benchmarks show state-of-the-art performance against Transformer and MLP-based baselines, with thorough ablations confirming the contributions of adaptive frequency weighting and gated fusion. The approach offers precise time–frequency localization, robustness to noise, and practical scalability for ultra-long horizons, with potential extensions to adaptive wavelets and anomaly detection.

Abstract

Forecasting long-term time series in IoT environments remains a significant challenge due to the non-stationary and multi-scale characteristics of sensor signals. Furthermore, error accumulation degrades forecast quality as the prediction horizon extends further into the future. Traditional methods operate only in the time domain, while the global frequency information obtained by the Fourier transform treats the signal as stationary, which blurs the temporal patterns of transient events. We propose AWEMixer, an Adaptive Wavelet-Enhanced Mixer Network with two new components: 1) a Frequency Router, designed to use the global periodicity pattern obtained by the Fast Fourier Transform to adaptively weight localized wavelet subbands, and 2) a Coherent Gated Fusion Block, which selectively integrates prominent frequency features with the multi-scale temporal representation through cross-attention and a gating mechanism, achieving accurate time-frequency localization while remaining robust to noise. Experiments on seven public benchmarks show that our model is more effective than recent state-of-the-art models. Specifically, it consistently improves on Transformer-based and MLP-based state-of-the-art models in long-sequence time series forecasting. Code is available at https://github.com/hit636/AWEMixer
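
To make the Frequency Router idea more concrete, the following is a minimal sketch of how a global FFT spectrum could be turned into per-subband weights for a wavelet decomposition. It is not the authors' implementation: the Haar wavelet, the dyadic band edges, the energy-based normalization, and the function names `haar_dwt` and `router_weights` are all assumptions made for illustration, and the paper's router may well be a learned module.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level Haar DWT; returns [cA_J, cD_J, ..., cD_1] (illustrative choice of wavelet)."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size % 2:  # pad odd lengths by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass half
        approx = (even + odd) / np.sqrt(2.0)         # low-pass half
    return [approx] + details[::-1]

def router_weights(x, levels, eps=1e-12):
    """Hypothetical router: share of FFT power falling in each subband's nominal range.

    The nominal dyadic ranges are only approximate for Haar filters, whose
    frequency responses overlap; this is an assumption of the sketch.
    """
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size)  # cycles per sample, from 0 to 0.5
    # Band edges ordered to match haar_dwt output: cA_J, cD_J, ..., cD_1
    edges = [0.0] + [0.5 / 2 ** j for j in range(levels, 0, -1)] + [np.inf]
    energy = np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                       for lo, hi in zip(edges[:-1], edges[1:])])
    energy = energy + eps  # avoid division by zero for constant inputs
    return energy / energy.sum()

t = np.arange(256)
slow = np.sin(2 * np.pi * t / 64)    # period of 64 samples
fast = np.sin(2 * np.pi * 0.4 * t)   # 0.4 cycles per sample

w_slow = router_weights(slow, levels=3)
w_fast = router_weights(fast, levels=3)

# Scale each subband by its routed weight
weighted = [w * c for w, c in zip(w_slow, haar_dwt(slow, levels=3))]
```

In this sketch the slow sinusoid routes most of its weight to the approximation band, while the fast one routes most of its weight to the finest detail band, so the global spectrum decides which localized subbands are emphasized.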

Paper Structure

This paper contains 36 sections, 14 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Conceptual illustration. (a) A non-stationary IoT signal with a transient high-frequency event. (b) The Fourier Transform captures the global frequencies but loses the temporal location of the event. (c) The Wavelet Transform localizes the event in both time and frequency, demonstrating its suitability for non-stationary signal analysis.
  • Figure 2: The overall architecture of AWEMixer. The model consists of a dual-stream backbone for feature extraction. (a) The temporal stream uses hierarchical average pooling to generate a multi-scale representation, capturing features from fine-grained fluctuations to long-term trends. (b) The frequency stream applies a Discrete Wavelet Transform and interpolation to create a parallel multi-band representation. (c) The Frequency Router dynamically weights the wavelet bands based on the input's global periodicity. (d) The Coherent Gated Fusion block selectively merges these adaptive frequency features with each temporal scale, followed by cross-scale mixing before the final prediction.
  • Figure 3: Case study comparing the forecasting performance of AWEMixer against baseline models on the ETTm2 and Weather datasets. Each plot displays the ground truth (orange) and a model's prediction (blue). The top row ((a)-(d)) showcases performance on the periodic ETTm2 dataset with a 720-step forecast horizon. The bottom row ((e)-(h)) illustrates robustness on the volatile Weather dataset with a 336-step forecast horizon.
  • Figure 4: Sensitivity analysis of the number of Gated Fusion layers ($N$). The plots show results on (a) the ETTh2 dataset and (b) the Weather dataset. Performance generally improves as $N$ increases from 1 to 3, with $N=3$ offering the best balance.
  • Figure 5: Sensitivity analysis of the wavelet decomposition level $J$. Results are shown for (a) the ETTh2 dataset and (b) the Weather dataset. A level of 3 or 4 offers the best balance between decomposition granularity and preservation of trend information.
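
The contrast drawn in Figure 1 can be reproduced numerically with a small sketch. The signal, the burst shape, and the helper names `haar_detail`, `with_burst`, and `burst_location` below are assumptions chosen for illustration; they are not taken from the paper. Two signals that differ only in where a transient burst occurs have the same Fourier magnitude spectrum, whereas level-1 Haar detail coefficients peak at the burst position.

```python
import numpy as np

def haar_detail(x):
    """Level-1 Haar detail coefficients, one per pair of adjacent samples."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

n = 512
t = np.arange(n)
base = np.sin(2 * np.pi * t / 128)  # slow periodic component, period 128

def with_burst(start, width=16):
    """Add a short alternating (high-frequency) burst beginning at `start`."""
    x = base.copy()
    x[start:start + width] += 2.0 * (-1.0) ** np.arange(width)
    return x

def burst_location(x):
    """Sample index of the largest level-1 detail coefficient."""
    d = haar_detail(x)
    return 2 * int(np.argmax(np.abs(d)))

# The late signal is the early one circularly shifted by 256 samples
# (two full periods of the base), so only the burst position differs.
x_early = with_burst(100)
x_late = with_burst(356)

# Fourier magnitudes are invariant to circular shifts: the event's timing is lost.
spec_early = np.abs(np.fft.fft(x_early))
spec_late = np.abs(np.fft.fft(x_late))
same_spectrum = np.allclose(spec_early, spec_late)

# Wavelet detail coefficients keep the timing of the event.
loc_early = burst_location(x_early)
loc_late = burst_location(x_late)
```

Here `same_spectrum` is true even though the bursts are 256 samples apart, while `loc_early` and `loc_late` fall inside their respective burst windows. The timing information is carried by the Fourier phase, which a magnitude-only view discards.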