Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness

Jinkwan Jang, Hyungjin Park, Jinmyeong Choi, Taesup Kim

TL;DR

ChannelTokenFormer is a Transformer-based forecasting framework with a flexible architecture designed to explicitly capture cross-channel interactions, accommodate channel-wise asynchronous sampling, and effectively handle missing values.

Abstract

Real-world time series data are inherently multivariate, often exhibiting complex inter-channel dependencies. Each channel is typically sampled at its own period and is prone to missing values due to various practical and operational constraints. These characteristics pose three fundamental challenges involving channel dependency, sampling asynchrony, and missingness, all of which must be addressed simultaneously to enable robust and reliable forecasting in practical settings. However, existing architectures typically address only parts of these challenges in isolation and still rely on simplifying assumptions, leaving unresolved the combined challenges of asynchronous channel sampling, test-time missing blocks, and intricate inter-channel dependencies. To bridge this gap, we propose ChannelTokenFormer, a Transformer-based forecasting framework with a flexible architecture designed to explicitly capture cross-channel interactions, accommodate channel-wise asynchronous sampling, and effectively handle missing values. Extensive experiments on public benchmark datasets reflecting practical settings, along with one private real-world industrial dataset, demonstrate the superior robustness and accuracy of ChannelTokenFormer under challenging real-world conditions.

Paper Structure

This paper contains 63 sections, 8 equations, 13 figures, 20 tables, 1 algorithm.

Figures (13)

  • Figure 1: Our proposed practical conditions highlight the simultaneous presence of channel-wise asynchronous sampling, block-wise missingness at test time, and inter-channel dependencies. Interpolation over coarsely sampled regions leads to signal distortion. See the appendix on distortion for more details.
  • Figure 2: Overview of ChannelTokenFormer (CTF). All tokens across channels pass through a unified attention layer, where local and global information is aggregated into channel tokens. Only the channel tokens are decoded by decoders, each shared among channels with the same sampling period, to produce the final prediction.
  • Figure 3: Our unified attention masking strategy. Local tokens perform intra-temporal attention within the same channel. Channel tokens aggregate local and cross-channel information, but are not visible to local tokens and do not attend to themselves. Optionally, attention among channel tokens from the same channel can be masked to encourage inter-channel interaction and reduce redundancy.
  • Figure 4: Frequency-domain comparison between CTF and TimeXer on a test sample from ETT1, showing that amplitude attenuation is prevented across all frequency bands.
  • Figure 5: Overview of the LNG Cargo Handling System (CHS).
  • ...and 8 more figures
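
The masking rules in the Figure 3 caption can be illustrated with a small boolean mask. The following is a minimal sketch, not the authors' implementation: the function name `build_attention_mask`, the token layout (each channel's local tokens followed by its channel tokens), and the choice that a channel token reads only its own channel's local tokens are all assumptions made for illustration. Per-channel token counts may differ, which is one way to picture channels sampled at different periods.

```python
import numpy as np

def build_attention_mask(n_local, n_channel_tokens, mask_same_channel=False):
    """Hypothetical helper: allowed[q, k] is True if query q may attend to key k.

    n_local: list with the number of local tokens per channel.
    n_channel_tokens: number of channel tokens appended to each channel.
    mask_same_channel: optionally block attention among channel tokens
        that belong to the same channel.
    """
    channel_id, is_channel_token = [], []
    for c, n in enumerate(n_local):
        # Assumed layout: local tokens first, then the channel tokens.
        channel_id += [c] * (n + n_channel_tokens)
        is_channel_token += [False] * n + [True] * n_channel_tokens
    channel_id = np.array(channel_id)
    is_ct = np.array(is_channel_token)

    same_channel = channel_id[:, None] == channel_id[None, :]
    q_ct = is_ct[:, None]   # query is a channel token
    k_ct = is_ct[None, :]   # key is a channel token

    # Local tokens attend only to local tokens of the same channel,
    # so channel tokens stay invisible to them.
    local_to_local = ~q_ct & ~k_ct & same_channel
    # Assumption: a channel token reads the local tokens of its own channel.
    ct_to_local = q_ct & ~k_ct & same_channel
    # Channel tokens attend to other channel tokens, never to themselves.
    not_self = ~np.eye(len(channel_id), dtype=bool)
    ct_to_ct = q_ct & k_ct & not_self
    if mask_same_channel:
        ct_to_ct &= ~same_channel

    return local_to_local | ct_to_local | ct_to_ct, channel_id, is_ct

# Two channels with 4 and 2 local tokens, 2 channel tokens each.
allowed, channel_id, is_ct = build_attention_mask([4, 2], 2)
print(allowed.astype(int))
```

In an attention layer, positions where `allowed` is False would typically have their scores set to negative infinity before the softmax. Setting `mask_same_channel=True` corresponds to the optional variant in the caption that pushes channel tokens toward inter-channel interaction.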