MCformer: Multivariate Time Series Forecasting with Mixed-Channels Transformer

Wenyong Han, Tao Zhu, Liming Chen, Huansheng Ning, Yang Luo, Yaping Wan

TL;DR

This paper tackles multivariate time-series forecasting in IoT settings by addressing the inter-channel correlation forgetting that can arise when channels are treated independently. It introduces Mixed Channels, a strategy that preserves the dataset-expanding benefits of Channel Independence while enabling inter-channel dependency learning through selective channel mixing. Built on a Transformer encoder, MCformer combines a Mixed-Channels Block and patch-based projections with Reversible Instance Normalization to model long-term and cross-channel features. Across five real-world datasets, MCformer achieves state-of-the-art performance. Ablations confirm gains as the number of mixed channels increases and indicate that inter-channel correlations are captured effectively.

Abstract

The massive generation of time-series data by large-scale Internet of Things (IoT) devices necessitates the exploration of more effective models for multivariate time-series forecasting. Earlier models predominantly used the Channel Dependence (CD) strategy (where each channel represents a univariate sequence). Current state-of-the-art (SOTA) models primarily rely on the Channel Independence (CI) strategy. The CI strategy treats all channels as a single channel, which expands the dataset to improve generalization performance and avoids inter-channel correlation that disrupts long-term features. However, the CI strategy faces the challenge of inter-channel correlation forgetting. To address this issue, we propose an innovative Mixed Channels strategy, which combines the data expansion advantages of the CI strategy with the ability to counteract inter-channel correlation forgetting. Based on this strategy, we introduce MCformer, a multivariate time-series forecasting model with mixed channel features. The model blends a specific number of channels and leverages an attention mechanism to capture inter-channel correlation information when modeling long-term features. Experimental results demonstrate that the Mixed Channels strategy outperforms the pure CI strategy in multivariate time-series forecasting tasks.
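
The dataset-expanding effect of the CI strategy mentioned in the abstract can be illustrated with a small reshaping sketch. It is not taken from the paper: the function name `to_channel_independent` and the `(batch, length, channels)` layout are assumptions made for illustration. Each channel of each input window becomes its own univariate sample, so a batch of B windows with C channels yields B × C samples that share one model.

```python
import numpy as np

def to_channel_independent(x: np.ndarray) -> np.ndarray:
    """Fold channels into the batch axis.

    x has shape (batch, length, channels); the result has shape
    (batch * channels, length, 1). Hypothetical helper, not the paper's code.
    """
    batch, length, channels = x.shape
    # Put the channel axis next to the batch axis, then merge the two.
    return x.transpose(0, 2, 1).reshape(batch * channels, length, 1)

# 2 windows, 4 time steps, 3 channels
x = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
ci = to_channel_independent(x)
print(ci.shape)  # (6, 4, 1)
```

After this reshaping the model never sees two channels in the same sample, which is the setting in which inter-channel correlation forgetting arises.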

Paper Structure

This paper contains 16 sections, 6 equations, 7 figures, 2 tables, and 1 algorithm.

Figures (7)

  • Figure 1: The difference between the CI, CD and Mixed Channels strategies.
  • Figure 2: Overview of the Mixed Channels method: Multivariate time series data is initially decomposed by channel, resulting in individual channel data. Subsequently, based on the channel interval size, data from different channels is mixed. The mixed data will share parameters in the Transformer Encoder.
  • Figure 3: Mixed Channels architecture: In the Mixed-Channels Block, we decompose multivariate time series data into single channels and then blend the data from different channels. The blended data is then segmented into multiple patches, with each patch composed of adjacent samples. These patches are transformed into input tokens through a projection process.
  • Figure 4: Application of Patch to temporal models. Using patches significantly extends the historical time range of the input while maintaining the same token length.
  • Figure 5: Accuracy improvement versus number of features. We computed the average MSE improvement of MCformer over the single-channel strategies TiDE and PatchTST for datasets with different numbers of channels. The results indicate that the performance improvement of MCformer becomes more pronounced as the number of channels increases. This suggests that MCformer can effectively capture dependencies in multivariate data, thereby enhancing predictive performance.
  • ...and 2 more figures
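
The mixing and patching steps described in the captions of Figures 2 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `mix_channels` and `make_patches`, the choice of interval as `channels // m`, the wrap-around indexing, and the flattening of each patch across the mixed channels are all hypothetical. The paper's algorithm may select and combine channels differently.

```python
import numpy as np

def mix_channels(x: np.ndarray, m: int) -> np.ndarray:
    """Build one mixed sample per channel.

    x has shape (length, channels); the result has shape (channels, length, m).
    Assumption: sample c combines channel c with m - 1 further channels
    taken at a fixed interval, wrapping around the channel axis.
    """
    length, channels = x.shape
    if not 1 <= m <= channels:
        raise ValueError("m must be between 1 and the number of channels")
    interval = channels // m  # assumed definition of the channel interval
    # idx[c, j] is the j-th channel index mixed into sample c.
    idx = (np.arange(channels)[:, None] + interval * np.arange(m)[None, :]) % channels
    return x[:, idx].transpose(1, 0, 2)

def make_patches(series: np.ndarray, patch_len: int, stride: int) -> np.ndarray:
    """Cut a (length, m) series into patches of adjacent time steps.

    Returns an array of shape (num_patches, patch_len * m); each row would
    then be projected to one input token.
    """
    length = series.shape[0]
    starts = range(0, length - patch_len + 1, stride)
    return np.stack([series[s:s + patch_len].reshape(-1) for s in starts])

# 96 time steps, 6 channels, 2 channels mixed per sample
x = np.arange(96 * 6, dtype=float).reshape(96, 6)
mixed = mix_channels(x, m=2)
patches = make_patches(mixed[0], patch_len=16, stride=8)
print(mixed.shape)    # (6, 96, 2)
print(patches.shape)  # (11, 32)
```

In this sketch the number of samples stays equal to the number of channels, as under CI, while each sample now carries information from m channels. With `m=1` the mixing reduces to one channel per sample. The patching step turns 96 time steps into 11 tokens, which shows how patches let a fixed token budget cover a longer history.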