
MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting

Runze Yang, Longbing Cao, Xiaoming Wu, Xin You, Kun Fang, Jianxun Li, Jie Yang

Abstract

Separating multiple effects in time series is fundamental yet challenging for time-series forecasting (TSF). Existing TSF models rely on smoothing-based temporal techniques and therefore cannot effectively learn an interpretable multi-effect decomposition. We present MLOW, an interpretable frequency-based decomposition pipeline built on two observations: a time series can be represented as a magnitude spectrum multiplied by the corresponding phase-aware basis functions, and the magnitude spectrum distribution of a time series always exhibits observable patterns for different effects. MLOW learns a low-rank representation of the magnitude spectrum to capture the dominant trend and seasonal effects. We explore low-rank methods, including PCA, NMF, and Semi-NMF, and find that none of them achieves a decomposition that is simultaneously interpretable, efficient, and generalizable. We therefore propose hyperplane-nonnegative matrix factorization (Hyperplane-NMF). Further, to address frequency (spectral) leakage, which restricts high-quality low-rank decomposition, MLOW provides a mathematical mechanism for flexibly selecting input horizons and frequency levels. Visual analysis demonstrates that MLOW enables an interpretable and hierarchical multiple-effect decomposition that is robust to noise. MLOW can also be plugged into existing TSF backbones, improving performance with minimal architectural modifications.
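The representation described in the abstract (a magnitude spectrum multiplied by phase-aware basis functions) can be illustrated with a plain real FFT. The following is a minimal sketch of that identity only, not the authors' implementation: the function name `magnitude_phase_reconstruct` and the choice of NumPy's `rfft` are assumptions made for illustration, and the low-rank step (Hyperplane-NMF) is not shown.

```python
import numpy as np

def magnitude_phase_reconstruct(x):
    """Hypothetical helper: split a real series into a magnitude spectrum
    and phase-aware cosine bases, then rebuild the series from them."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    magnitude = np.abs(spectrum)   # nonnegative, one value per frequency
    phase = np.angle(spectrum)     # phase offset of each frequency
    t = np.arange(n)
    k = np.arange(len(spectrum))
    # Phase-aware basis: cos(2*pi*k*t/n + phase_k), one row per frequency.
    basis = np.cos(2 * np.pi * np.outer(k, t) / n + phase[:, None])
    # Real-FFT weights: DC (and Nyquist when n is even) count once,
    # every other frequency counts twice (it stands for +k and -k).
    weight = np.full(len(spectrum), 2.0)
    weight[0] = 1.0
    if n % 2 == 0:
        weight[-1] = 1.0
    basis = basis * (weight / n)[:, None]
    # The series is the magnitude spectrum times the phase-aware bases.
    return magnitude, basis, magnitude @ basis

# Example: a linear trend plus a period-24 seasonal term and noise.
rng = np.random.default_rng(0)
t = np.arange(96)
series = 0.05 * t + np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(96)
magnitude, basis, rebuilt = magnitude_phase_reconstruct(series)
```

Because the magnitude vector is nonnegative, it is a natural input for nonnegative factorizations such as NMF, while the sign and timing information stays in the bases.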

Paper Structure (19 sections, 7 equations, 28 figures, 12 tables, 1 algorithm)

Figures (28)

  • Figure 1: The Motivation and Visualization for MLOW
  • Figure 2: Inference pipeline for MLOW. The original time series is decomposed into $V$ components and a residual. A larger window for $X$ enables more flexible extraction of frequency magnitude levels while preserving the same temporal information. The learned Hyperplane-NMF components provide interpretable representations and serve as interpretable sources for the decomposed pieces.
  • Figure 3: Learned 10 Components for the ECL Data Compared to Its Magnitude Spectrum Distributions. The blue regions correspond to the 95% confidence interval of magnitude spectrum, the green columns show the mean magnitude spectrum, and the red columns indicate the weights of the learned components. A larger version is shown in the Appendix.
  • Figure 5: Learned 10 Components for the ECL data Compared to Its Magnitude Spectrum Distribution. The blue regions correspond to the 95% confidence interval of magnitude spectrum, the green columns show the mean magnitude spectrum, and the red columns indicate the weights of the learned components.
  • Figure 6: Learned 10 Components for the Traffic data Compared to Its Magnitude Spectrum Distribution. The blue regions correspond to the 95% confidence interval of magnitude spectrum, the green columns show the mean magnitude spectrum, and the red columns indicate the weights of the learned components.
  • ...and 23 more figures