Revitalizing Canonical Pre-Alignment for Irregular Multivariate Time Series Forecasting

Ziyu Zhou, Yiming Huang, Yanyun Wang, Yuankai Wu, James Kwok, Yuxuan Liang

TL;DR

This work tackles irregular multivariate time series forecasting by reintroducing Canonical Pre-Alignment (CPA) with efficient handling of inflated sequence length. It introduces KAFNet, integrating a Pre-Convolution for smoothing, Temporal Kernel Aggregation to compress CPA-aligned sequences, and Frequency Linear Attention to capture global inter-variate correlations in the frequency domain. The approach achieves state-of-the-art accuracy on four IMTS benchmarks while reducing parameters by about 7.2x and speeding up training/inference by about 8.4x compared with leading graph-based baselines. The results demonstrate that CPA, when paired with targeted compression and efficient attention, can surpass bypass strategies that sacrifice global inter-variate modeling. The work also points to future extensions to other IMTS tasks and deployment-scale evaluations in real-world domains.

Abstract

Irregular multivariate time series (IMTS), characterized by uneven sampling and inter-variate asynchrony, fuel many forecasting applications yet remain challenging to model efficiently. Canonical Pre-Alignment (CPA) has been widely adopted in IMTS modeling: it pads zeros at every global timestamp, thereby alleviating inter-variate asynchrony and unifying the series length. However, its dense zero-padding inflates the pre-aligned series length, especially when numerous variates are present, causing prohibitive compute overhead. Recent graph-based models with patching strategies sidestep CPA, but their local message passing struggles to capture global inter-variate correlations. We therefore posit that CPA should be retained, with the pre-aligned series properly handled by the model, enabling it to outperform state-of-the-art graph-based baselines that sidestep CPA. Technically, we propose KAFNet, a compact architecture grounded in CPA for IMTS forecasting that couples (1) a Pre-Convolution module for sequence smoothing and sparsity mitigation, (2) a Temporal Kernel Aggregation module for learnable compression and modeling of intra-series irregularity, and (3) Frequency Linear Attention blocks for low-cost modeling of inter-series correlations in the frequency domain. Experiments on multiple IMTS datasets show that KAFNet achieves state-of-the-art forecasting performance, with a 7.2$\times$ parameter reduction and an 8.4$\times$ training-inference acceleration.
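As an illustration of the pre-alignment step described in the abstract, the following minimal NumPy sketch builds the global timeline as the union of all variates' timestamps and zero-pads each variate onto it. The function name `canonical_pre_align`, the input layout (one `(times, values)` pair per variate), and the boolean observation mask are assumptions made for this example; they are not taken from the paper or its code.

```python
import numpy as np


def canonical_pre_align(series):
    """Zero-pad irregular variates onto one shared timeline.

    `series` is a list of (times, values) pairs, one per variate.
    This input layout is a hypothetical choice for illustration.
    """
    # Global timeline: sorted union of every variate's observation times.
    all_times = [np.asarray(t, dtype=float) for t, _ in series]
    global_times = np.unique(np.concatenate(all_times))

    num_steps, num_vars = len(global_times), len(series)
    values = np.zeros((num_steps, num_vars))            # zero where unobserved
    mask = np.zeros((num_steps, num_vars), dtype=bool)  # True where observed (assumed helper)

    for n, (t, v) in enumerate(series):
        # Each observed time is present in global_times, so searchsorted
        # returns its exact row index.
        idx = np.searchsorted(global_times, np.asarray(t, dtype=float))
        values[idx, n] = np.asarray(v, dtype=float)
        mask[idx, n] = True
    return global_times, values, mask


# Three asynchronous variates with 3, 2, and 1 observations.
series = [
    ([0.0, 1.5, 4.0], [1.0, 2.0, 3.0]),
    ([0.5, 1.5], [10.0, 20.0]),
    ([3.0], [7.0]),
]
global_times, values, mask = canonical_pre_align(series)
density = mask.mean()  # fraction of cells that hold real observations
```

In this toy input the longest variate has 3 observations, yet the aligned length is 5 because every distinct timestamp adds a row, and only 6 of the 15 cells hold real observations. This is the length inflation and sparsity that the abstract attributes to CPA, and it grows as more asynchronous variates are added.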

Paper Structure

This paper contains 26 sections, 13 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Illustration of Canonical Pre‑Alignment (CPA).
  • Figure 2: KAFNet delivers superior predictive accuracy (MAE) and efficiency (average) on four IMTS datasets.
  • Figure 3: The main architecture of KAFNet. The input IMTS is first processed by CPA and fed into the Pre‑Convolution module ($n\in[1,N]$) for sequence smoothing, then passed through the Temporal Kernel Aggregation module for intra‑series irregularity modeling and through the Frequency Linear Attention blocks for inter‑series correlation modeling. Finally, the Output Layer generates the query-specific forecasts. Linear Attn: linear attention mechanism; MLP: multi-layer perceptron.
  • Figure 4: Sensitivity of MSE to (a) the number of Gaussian kernels in TKA and (b) the hidden state dimension in Eq. \ref{eq:tka_proj}, FLA and the Output Layer, on two IMTS datasets.
  • Figure 5: Comparison of the number of parameters (K), FLOPs (B), average training time per batch per epoch (s), and total inference time (s) of KAFNet and four strong baselines for IMTS forecasting. All statistics are collected on the MIMIC dataset with a batch size of 32 to ensure a fair comparison. Lower values indicate higher efficiency.
  • ...and 1 more figure