EffiCANet: Efficient Time Series Forecasting with Convolutional Attention

Xinxing Zhou, Jiaqi Ye, Shubao Zhao, Ming Jin, Chengyi Yang, Yanlong Wen, Xiaojie Yuan

TL;DR

Empirical evaluations show that EffiCANet reduces MAE by up to 10.02% relative to state-of-the-art models, while its efficient decomposition strategy cuts computational costs by 26.2% relative to conventional large-kernel convolution methods.

Abstract

The exponential growth of multivariate time series data from sensor networks in domains like industrial monitoring and smart cities requires efficient and accurate forecasting models. Current deep learning methods often fail to adequately capture long-range dependencies and complex inter-variable relationships, especially under real-time processing constraints. These limitations arise because many models are optimized either for short-term forecasting with limited receptive fields or for long-term accuracy at the cost of efficiency. Additionally, dynamic and intricate interactions between variables in real-world data further complicate modeling efforts. To address these limitations, we propose EffiCANet, an Efficient Convolutional Attention Network designed to enhance forecasting accuracy while maintaining computational efficiency. EffiCANet integrates three key components: (1) a Temporal Large-kernel Decomposed Convolution (TLDC) module that captures long-term temporal dependencies while reducing computational overhead; (2) an Inter-Variable Group Convolution (IVGC) module that captures complex and evolving relationships among variables; and (3) a Global Temporal-Variable Attention (GTVA) mechanism that prioritizes critical temporal and inter-variable features. Extensive evaluations across nine benchmark datasets show that EffiCANet reduces MAE by up to 10.02% relative to state-of-the-art models, while its efficient decomposition strategy cuts computational costs by 26.2% relative to conventional large-kernel convolution methods.

Paper Structure

This paper contains 28 sections, 22 equations, 10 figures, 4 tables, 1 algorithm.

Figures (10)

  • Figure 1: Illustration of asynchrony and lead-lag relationship in multiple time series: (a) Three variables exhibiting asynchronous behavior due to measurement errors, marked by vertical dashed lines. (b) A lead-lag relationship between two variables, with one leading the other by 10 timestamps.
  • Figure 2: Overview of the EffiCANet architecture. The input is first processed through a patching and embedding layer, transforming raw data into a suitable feature space. The core structure consists of $L$ stacked Blocks, each containing three key modules: TLDC, IVGC, and GTVA. Within each Block, the output is iteratively refined by element-wise multiplication with its respective input, enhancing feature representations. Finally, the output is passed through a prediction head to generate the forecast.
  • Figure 3: Illustration of the TLDC module. The input tensor, shaped as $(M, D, N)$, is processed through group convolution, segregating it into $M \times D$ groups corresponding to each variable-channel pair. Each group undergoes a decomposition of standard convolution into two stages: a depth-wise convolution (DW Conv) and a depth-wise dilated convolution (DW-D Conv). In this example, a DW Conv with a kernel size of 5 is followed by a DW-D Conv with the same kernel size and a dilation rate of 3, together simulating the effect of a large kernel size of 13. The combined outputs capture both short-term and long-term temporal dependencies effectively.
  • Figure 4: Illustration of the IVGC process. Initially, the data is padded to ensure divisibility by the time window size. Two padding strategies are applied: standard padding (top) and head-tail padding (bottom), creating distinct segmentations of the time dimension. Each group, represented by non-overlapping time windows, undergoes a convolution with a kernel size of 1 to capture inter-variable interactions within localized temporal patches. The outputs from both padding strategies are later aligned, merged, and further processed to produce the final integrated representation.
  • Figure 5: Illustration of the GTVA mechanism using Squeeze-and-Excitation (SE) blocks. Temporal and variable attention pathways separately generate attention weights, which are then multiplied with the convolution outputs to enhance feature learning in both dimensions.
  • ...and 5 more figures
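
The decomposition described in the Figure 3 caption (a kernel-5 DW Conv followed by a kernel-5 DW-D Conv with dilation 3, standing in for a kernel of size 13) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the sizing rule in `decompose_large_kernel` is an assumption borrowed from the common large-kernel-attention convention (it happens to reproduce the 5/5 split in the caption), and the helper `dilated_conv1d`, the uniform weights, and the toy series are all hypothetical.

```python
import math


def decompose_large_kernel(kernel_size, dilation):
    """Return (dw_kernel, dwd_kernel) for a two-stage depth-wise split.

    Assumed rule (not confirmed by the source): the local DW Conv uses
    2 * dilation - 1 taps and the dilated DW-D Conv uses
    ceil(kernel_size / dilation) taps.
    """
    dw_kernel = 2 * dilation - 1
    dwd_kernel = math.ceil(kernel_size / dilation)
    return dw_kernel, dwd_kernel


def dilated_conv1d(x, weights, dilation=1):
    """Single-channel 1D convolution with zero padding, output length == input length."""
    half = (len(weights) - 1) // 2 * dilation  # offset of the first tap
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - half + j * dilation
            if 0 <= idx < len(x):  # positions outside the series count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


# Example from the Figure 3 caption: target kernel 13, dilation 3.
dw_k, dwd_k = decompose_large_kernel(13, 3)

# Toy single variable-channel series (hypothetical data).
series = [float(t % 7) for t in range(32)]

# Stage 1: local depth-wise convolution; stage 2: dilated depth-wise convolution.
hidden = dilated_conv1d(series, [1.0 / dw_k] * dw_k, dilation=1)
output = dilated_conv1d(hidden, [1.0 / dwd_k] * dwd_k, dilation=3)

# Weights per variable-channel pair: two small kernels versus one large kernel.
decomposed_weights = dw_k + dwd_k
direct_weights = 13
```

Under these assumptions the two stages need `dw_k + dwd_k` weights per variable-channel pair instead of 13 for a single large kernel, which is the kind of saving the TLDC module targets. The 26.2% figure reported in the abstract is a measured result and is not derived from this count.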