CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting
Haoxin Wang, Yipeng Mo, Kunlan Xiang, Nan Yin, Honghe Dai, Bixiong Li, Songhai Fan, Site Mo
TL;DR
CSformer tackles multivariate time series forecasting (MTSF) by combining channel independence with channel mixing through a two-stage multi-head self-attention (MSA) mechanism with shared parameters and adapters. It uses a dimension-augmented embedding to expand the sequence representation and applies channel MSA and sequence MSA in a unified framework, enabling cross-dimension information fusion while maintaining efficiency. Empirical results across diverse datasets show state-of-the-art performance with strong generalization, supported by ablations of the two-stage MSA, the adapters, and the training strategy. The approach offers a practical, robust solution for real-world MTSF tasks and a training paradigm in which channel independence is followed by mixing.
Abstract
In the domain of multivariate time series analysis, the concept of channel independence has been increasingly adopted, demonstrating excellent performance due to its ability to eliminate noise and the influence of irrelevant variables. However, such a concept often simplifies the complex interactions among channels, potentially leading to information loss. To address this challenge, we propose a strategy of channel independence followed by mixing. Based on this strategy, we introduce CSformer, a novel framework featuring a two-stage multi-head self-attention (MSA) mechanism. This mechanism is designed to extract and integrate both channel-specific and sequence-specific information. Distinctively, CSformer employs parameter sharing to enhance the cooperative effects between these two types of information. Moreover, our framework effectively incorporates sequence and channel adapters, significantly improving the model's ability to identify important information across various dimensions. Extensive experiments on several real-world datasets demonstrate that CSformer achieves state-of-the-art results in terms of overall performance.
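The two-stage idea described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the single attention head, the absence of normalization layers, the random weights, the choice of attention axes, and every name (`two_stage_block`, `adapter`, `d_model`, `bottleneck`) are assumptions made for illustration only. What it aims to show is the overall shape of the approach: each scalar is lifted to a `d_model`-dimensional vector, attention runs first across channels and then across time steps with the same query/key/value weights, and each stage has its own small adapter.

```python
import math
import random

def matmul(a, b):
    # Plain matrix product: a is [m][k], b is [k][n].
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(tokens, wq, wk, wv):
    # Single-head scaled dot-product self-attention over tokens [T][D].
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(wq[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return matmul([softmax(r) for r in scores], v)

def adapter(tokens, down, up):
    # Bottleneck adapter with a residual path: x + relu(x @ down) @ up.
    hidden = [[max(0.0, h) for h in row] for row in matmul(tokens, down)]
    delta = matmul(hidden, up)
    return [[x + d for x, d in zip(r, dr)] for r, dr in zip(tokens, delta)]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def two_stage_block(series, d_model=4, bottleneck=2, seed=0):
    # series: [N channels][L time steps] of floats. Weights are random
    # stand-ins for learned parameters (assumption for this sketch).
    rng = random.Random(seed)
    n_channels, length = len(series), len(series[0])

    # Dimension-augmented embedding (assumed form): scalar * vector,
    # turning [N][L] into [N][L][D].
    embed = [rng.gauss(0.0, 1.0) for _ in range(d_model)]
    h = [[[x * e for e in embed] for x in channel] for channel in series]

    # One set of attention weights shared by both stages.
    wq, wk, wv = (rand_matrix(d_model, d_model, rng) for _ in range(3))
    # Separate adapters for the channel stage and the sequence stage.
    ch_down, ch_up = rand_matrix(d_model, bottleneck, rng), rand_matrix(bottleneck, d_model, rng)
    sq_down, sq_up = rand_matrix(d_model, bottleneck, rng), rand_matrix(bottleneck, d_model, rng)

    # Stage 1: channel attention. At each time step the tokens are the channels.
    for t in range(length):
        tokens = [h[n][t] for n in range(n_channels)]
        out = adapter(attention(tokens, wq, wk, wv), ch_down, ch_up)
        for n in range(n_channels):
            h[n][t] = [a + b for a, b in zip(h[n][t], out[n])]  # residual

    # Stage 2: sequence attention. Within each channel the tokens are time steps.
    for n in range(n_channels):
        out = adapter(attention(h[n], wq, wk, wv), sq_down, sq_up)
        h[n] = [[a + b for a, b in zip(r, o)] for r, o in zip(h[n], out)]

    return h  # [N][L][D]
```

Calling `two_stage_block([[1.0, 2.0, 3.0], [0.5, 0.1, -0.2]])` returns a nested list with 2 channels, 3 time steps, and 4 features per position. A real model would add multiple heads, normalization, a prediction head, and trained weights.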
