Pyramidal Hidden Markov Model For Multivariate Time Series Forecasting

YeXin Huang

TL;DR

PHMM introduces a pyramidal hidden Markov framework to capture long-range dependencies in multivariate time series by stacking multistep stochastic states with an attention mechanism. It combines an input HMM branch with a multistep, attention-enabled branch in a two-branch DAG, trained as a sequential VAE by maximizing the ELBO. Experiments on 20 UEA multivariate time series datasets and a Stocks dataset show state-of-the-art or competitive performance, especially for nonstationary and short-sample scenarios. The approach formalizes tunable hyperparameters $k$ and $m$ to balance short- and long-term dynamics, enabling robust, long-horizon forecasting of complex time series.

Abstract

The Hidden Markov Model (HMM) can predict the future value of a time series based on its current and previous values, making it a powerful algorithm for handling various types of time series. Numerous studies have explored the improvement of HMM using advanced techniques, leading to the development of several variations of HMM. Despite these studies indicating the increased competitiveness of HMM compared to other advanced algorithms, few have recognized the significance and impact of incorporating multistep stochastic states into its performance. In this work, we propose a Pyramidal Hidden Markov Model (PHMM) that can capture multiple multistep stochastic states. Initially, a multistep HMM is designed for extracting short multistep stochastic states. Next, a novel time series forecasting structure is proposed based on PHMM, which utilizes pyramid-like stacking to adaptively identify long multistep stochastic states. By employing these two schemes, our model can effectively handle non-stationary and noisy data, while also establishing long-term dependencies for more accurate and comprehensive forecasting. The experimental results on diverse multivariate time series datasets convincingly demonstrate the superior performance of our proposed PHMM compared to its competitive peers in time series forecasting.

Paper Structure (14 sections, 13 equations, 3 figures, 4 tables)

This paper contains 14 sections, 13 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Brief structure of PHMM. Stacking Multistep HMM Mechanisms on top of the Input HMM Mechanism yields higher accuracy and efficiency. The additional layers model more complex data patterns, allowing for more precise predictions.
  • Figure 2: Directed acyclic graph for the Pyramidal Hidden Markov Model (where $k=2, m=3$). The variable $k$ denotes the time step (the number of lower-layer hidden variables each multistep state draws on), the variable $m$ denotes the number of stacked layers, and $h_{t}^{i}$ $(i=1, 2, \dots, m)$ denotes the latent variable at level $i$. The hidden variables propagate independently as $t$ grows. At each step, $x_{t-1}$ generates the lowest-level latent variable for the output $y_t$, and this lower-level latent variable is then propagated to the higher levels to generate longer stochastic states. Gray variables are unobserved; variables in other colors are observed.
  • Figure 3: Left: Illustration of the time series architecture of the proposed PHMM. At each time step $t$, the prior network feeds the concatenated features (i. features produced by the encoder from the time series input; ii. the hidden variable from the previous step) into PHMM (prior). PHMM (prior) is followed by two FC layers that output the mean and log-variance vectors respectively. After sampling, the prior hidden variables are obtained. For the posterior network, the input $x_{t}$ is processed by fully connected layers to extract the corresponding features. These features are then processed by PHMM (posterior), which is likewise followed by two FC layers that output the mean and log-variance vectors respectively. The hidden variables of the Input HMM Mechanism are then fed into the decoder network to reconstruct the input. Finally, all hidden variables generated by both the Input HMM Mechanism and the Multistep HMM Mechanisms are concatenated and fed into the predictor to predict future values. Right: The structure of the two basic components at each time step $t$. The Input HMM Mechanism consists of a shared GRU cell with parameter $\theta_{1}$. The Multistep HMM Mechanism consists of a shared GRU cell with parameter $\theta_{2}$ and an attention mechanism with parameter $\pi$, which captures short-term dependencies based on $k$ hidden variables from the lower layer.
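
The two components described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single stacked layer ($m=2$), a standard GRU update, scaled dot-product attention in which a hypothetical projection matrix `pi` stands in for the attention parameter $\pi$, and randomly initialized weights in place of trained ones. The helper names (`gru_cell`, `attend`, `sample_gaussian`, `init_gru`) are illustrative, and only the lower layer is sampled here to keep the example short.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x, h, p):
    """Standard GRU update; p holds the weight matrices shared across time."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)        # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)        # reset gate
    n = np.tanh(p["Wn"] @ x + p["Un"] @ (r * h))  # candidate state
    return (1.0 - z) * n + z * h

def attend(lower_states, query, pi):
    """Scaled dot-product attention over up to k lower-layer hidden variables.

    lower_states: (k, d), query: (d,), pi: (d, d) hypothetical projection.
    """
    scores = lower_states @ (pi @ query) / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights = weights / weights.sum()
    return weights @ lower_states, weights

def sample_gaussian(feature, W_mu, W_logvar, rng):
    """Two linear (FC) heads give mean and log-variance; sample by reparameterization."""
    mu = W_mu @ feature
    logvar = W_logvar @ feature
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps, mu, logvar

def init_gru(d_in, d_h, rng):
    """Small random weights (assumption: stands in for trained parameters)."""
    shapes = {"Wz": (d_h, d_in), "Uz": (d_h, d_h),
              "Wr": (d_h, d_in), "Ur": (d_h, d_h),
              "Wn": (d_h, d_in), "Un": (d_h, d_h)}
    return {name: 0.1 * rng.standard_normal(shape) for name, shape in shapes.items()}

rng = np.random.default_rng(0)
d, k, T = 4, 2, 6                       # feature size, attention window, sequence length
theta1 = init_gru(d, d, rng)            # Input HMM Mechanism GRU
theta2 = init_gru(d, d, rng)            # Multistep HMM Mechanism GRU
pi = 0.1 * rng.standard_normal((d, d))
W_mu = 0.1 * rng.standard_normal((d, d))
W_logvar = 0.1 * rng.standard_normal((d, d))

x = rng.standard_normal((T, d))         # toy multivariate series
h1 = np.zeros(d)                        # lower-level deterministic state
h2 = np.zeros(d)                        # higher-level state
lower_history = []
for t in range(T):
    # Lower level: GRU step, then sample a stochastic hidden variable.
    h1 = gru_cell(x[t], h1, theta1)
    z1, mu1, logvar1 = sample_gaussian(h1, W_mu, W_logvar, rng)
    lower_history.append(z1)
    # Higher level: attend over the last k lower-level variables, then GRU step.
    window = np.stack(lower_history[-k:])
    context, weights = attend(window, h2, pi)
    h2 = gru_cell(context, h2, theta2)

# Hidden variables from both levels are concatenated for a downstream predictor.
features = np.concatenate([z1, h2])
```

In this sketch the window `lower_history[-k:]` plays the role of the $k$ lower-layer hidden variables. Stacking further layers would repeat the attend-then-GRU step on the outputs of the layer below, each with its own sampled hidden variable.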