Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding

Zenia Elise Damgaard Bæk, Moisès Coll Macià, Laurits Skov, Asger Hobolth

TL;DR

This paper tackles the limitations of standard hidden Markov model decoding by introducing finite Markov chain imbedding (FMCI) to obtain exact posterior distributions of hidden-state pattern statistics and by developing hybrid decoding, which blends local and global criteria via a tuning parameter $\alpha$. FMCI leverages the posterior as an inhomogeneous Markov chain to compute distributions for statistics such as the number of jumps, total time in a state, exact run lengths, and longest run, using structured, sparse block matrices to enable exact calculations. Hybrid decoding, formalized through a weighted geometric mean and an Artemis plot for $\alpha$ selection, yields decoding paths that outperform both Viterbi and Posterior decoding in intermediate regimes and provide a principled way to quantify uncertainty in the decoded sequence. The methods are illustrated on classical data sets (e.g., fetal movement counts and earthquakes) with supplemental code, and robustness and block-wise accuracy analyses demonstrate practical gains for complex HMM applications. Overall, the work advances posterior pattern analysis and robust decoding for HMMs, offering scalable techniques and actionable guidance for applied researchers.

Abstract

Two major tasks in applications of hidden Markov models are to (i) compute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve each of these two tasks. In the first part of our paper we use FMCI to compute posterior distributions of summary statistics such as the number of visits to a hidden state, the total time spent in a hidden state, the dwell time in a hidden state, and the longest run length. We use simulations from the hidden state sequence, conditional on the observed sequence, to establish the FMCI framework. In the second part of our paper we apply hybrid segmentation for improved decoding of an HMM. We demonstrate that hybrid decoding shows increased performance compared to Viterbi or Posterior decoding (often also referred to as global or local decoding), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure. Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means. We demonstrate and apply FMCI and hybrid decoding on various classical data sets, and supply accompanying code for reproducibility.

Paper Structure

This paper contains 21 sections, 2 theorems, 48 equations, 14 figures, 1 table.

Key Result

Theorem 1

In a hidden Markov model $(\pi,\Gamma,\Phi)$, the hidden state sequence $y=(y_1,\ldots,y_n)$ conditional on the observed sequence $x=(x_1,\ldots,x_n)$, is an inhomogeneous first-order Markov chain with transition probabilities for $t=2,\ldots,n$, where $\beta_t(y_t)=\mathbb{P}(x_{t+1},\ldots,x_n \mid y_t)$ is the matrix of backward probabilities. The conditional initial state probabilities are given b
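
The result in Theorem 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' accompanying code. It assumes the standard form of the conditional transition probabilities, $\mathbb{P}(y_t=j \mid y_{t-1}=i, x)=\Gamma_{ij}\,b_j(x_t)\,\beta_t(j)/\beta_{t-1}(i)$, where $b_j(x_t)$ is a hypothetical name for the emission probability of $x_t$ in state $j$. The function names, the emission table `B`, and the toy numbers are all assumptions made for illustration. The second function shows one FMCI-style computation: the exact posterior distribution of the number of transitions from state `a` to state `b`, obtained by tracking the pair (count, current state) along the conditional chain.

```python
import numpy as np

def backward(Gamma, B):
    """beta[t, i] = P(x_{t+1}, ..., x_n | y_t = i). Unscaled, so only for short sequences."""
    n, m = B.shape
    beta = np.ones((n, m))
    for t in range(n - 2, -1, -1):
        beta[t] = Gamma @ (B[t + 1] * beta[t + 1])
    return beta

def conditional_chain(pi, Gamma, B):
    """Initial distribution and transition matrices of the hidden chain given x."""
    n, m = B.shape
    beta = backward(Gamma, B)
    init = pi * B[0] * beta[0]
    init = init / init.sum()
    # trans[t - 1][i, j] = P(y_t = j | y_{t-1} = i, x), using 0-based time indices
    trans = np.empty((n - 1, m, m))
    for t in range(1, n):
        trans[t - 1] = Gamma * (B[t] * beta[t])[None, :] / beta[t - 1][:, None]
    return init, trans

def transition_count_distribution(init, trans, a, b):
    """Posterior pmf of the number of a -> b transitions (counts 0, ..., n - 1)."""
    n = trans.shape[0] + 1
    dist = np.zeros((n, init.size))  # dist[c, i] = P(count = c, current state = i | x)
    dist[0] = init
    for P in trans:
        move = P.copy()
        move[a, b] = 0.0                      # moves that leave the count unchanged
        new = dist @ move
        new[1:, b] += dist[:-1, a] * P[a, b]  # an a -> b move increments the count
        dist = new
    return dist.sum(axis=1)

# Hypothetical two-state example; B[t, j] = P(x_t | y_t = j) for n = 5 observations.
pi = np.array([0.6, 0.4])
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6], [0.2, 0.5], [0.5, 0.2]])

init, trans = conditional_chain(pi, Gamma, B)
pmf = transition_count_distribution(init, trans, a=0, b=1)
print(pmf)
```

Each `trans[t - 1]` has rows that sum to one, which reflects the identity $\beta_{t-1}(i)=\sum_j \Gamma_{ij}\,b_j(x_t)\,\beta_t(j)$. For long sequences the backward recursion would need scaling or log-space arithmetic, which this sketch omits.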

Figures (14)

  • Figure 1: Illustration and analysis of fetal movement count data. (a) The data consists of the number of fetal lamb movements in 5-second intervals. (b) Illustration of Theorem 1 in Section \ref{Sec:FMCI} of the manuscript. The red line is the probability for the hidden Markov chain of staying in state 1 at any time interval along the sequence given the observed sequence. The green line is the probability of staying in state 2 given the observed sequence. (c) Top: Viterbi and Posterior decoding are identical for this data. Bottom: Samples of 1000 hidden state sequences from the conditional Markov chain illustrated in (b). The samples give a more nuanced and informative picture of the conditional hidden Markov chain than Viterbi or Posterior decoding. (d) Pointwise empirical frequency of the 1000 samples is almost identical to the exact posterior probabilities found from the forward-backward tables.
  • Figure 2: FMCI for fetal movement count data. Analytical plots are from FMCI and empirical plots are based on summary statistics from 1000 samples. (a) Posterior distribution of the number of transitions from state 1 to state 2. (b) Posterior distribution of the number of intervals in state 2. (c) Expected number of run lengths in state 2. (d) Posterior distribution of the longest run in state 2.
  • Figure 3: A visualization of how the hybrid paths change depending on $\alpha$ for the earthquakes data (A) and a simulated example (B). The dashed line is the observed sequence. (A): Posterior decoding and Viterbi differ in two years, 1918 and 1973 (pink boxes 1 and 2). For $0 \leq \alpha < 0.11$ the hybrid paths are identical to Posterior decoding. For $0.11 \leq \alpha < 0.52$ the hybrid paths are identical to each other and different from both Posterior decoding and Viterbi, and for $0.52 \leq \alpha \leq 1$ the hybrid paths are identical to Viterbi. (B): Posterior decoding and Viterbi differ at three time points (pink boxes 1-3). For $0 \leq \alpha < 0.07$, the hybrid paths are identical to Posterior decoding. For $0.07 \leq \alpha < 0.99$, the hybrid paths differ from both Posterior decoding and Viterbi, and for $0.99 \leq \alpha \leq 1$, the hybrid paths are identical to Viterbi.
  • Figure 4: An illustration of an Artemis plot. On the x-axis is the pointwise accuracy, and on the y-axis is the log-joint probability. We start in the lower right corner with Posterior decoding ($\alpha = 0$), and increase $\alpha$ until we reach Viterbi ($\alpha = 1$). We choose the tuning parameter $\alpha$ as the value that intersects the bow-shaped curve at a 45-degree angle, as illustrated in the figure. Here $\alpha=0.461$ at the intersection point between the bow and the arrow.
  • Figure 5: Graphical illustration of the result that the hidden state sequence $(y_t)_{1\leq t \leq n}$, given the data $x=(x_t)_{1 \leq t \leq n}$, is an inhomogeneous Markov chain. The dashed line illustrates the variables that we condition upon in the left-hand side of the equation.
  • ...and 9 more figures
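
The dependence of the hybrid path on $\alpha$ described in the captions of Figures 3 and 4 can be sketched in Python. This is a hedged illustration under an assumed objective, not the authors' implementation: the path maximizes $\alpha \log \mathbb{P}(y, x) + (1-\alpha)\sum_t \log \mathbb{P}(y_t \mid x)$, which is the logarithm of a weighted geometric mean of the joint probability and the product of pointwise posteriors. With this choice $\alpha=0$ recovers Posterior decoding and $\alpha=1$ recovers Viterbi, consistent with the captions. The function names, the emission table `B`, and the toy numbers are hypothetical, and all probabilities are assumed strictly positive so that the logarithms are finite.

```python
import numpy as np

def posterior_marginals(pi, Gamma, B):
    """post[t, i] = P(y_t = i | x) via unscaled forward-backward (short sequences only)."""
    n, m = B.shape
    fwd = np.empty((n, m))
    bwd = np.ones((n, m))
    fwd[0] = pi * B[0]
    for t in range(1, n):
        fwd[t] = (fwd[t - 1] @ Gamma) * B[t]
    for t in range(n - 2, -1, -1):
        bwd[t] = Gamma @ (B[t + 1] * bwd[t + 1])
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

def hybrid_decode(pi, Gamma, B, alpha):
    """Viterbi-style dynamic program for the assumed hybrid objective."""
    n, m = B.shape
    local = (1.0 - alpha) * np.log(posterior_marginals(pi, Gamma, B))
    score = alpha * (np.log(pi) + np.log(B[0])) + local[0]
    back = np.zeros((n, m), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + alpha * np.log(Gamma)  # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + alpha * np.log(B[t]) + local[t]
    path = np.empty(n, dtype=int)
    path[-1] = score.argmax()
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical two-state example; B[t, j] = P(x_t | y_t = j) for n = 5 observations.
pi = np.array([0.6, 0.4])
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6], [0.2, 0.5], [0.5, 0.2]])

for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_decode(pi, Gamma, B, alpha))
```

Sweeping `alpha` over a grid and recording, for each resulting path, the sum of pointwise posterior probabilities and the log-joint probability would give the two coordinates needed for a plot in the spirit of the Artemis plot. The exact axis definitions used in the paper are not reproduced here.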

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Definition 1
  • Definition 2
  • Theorem 2
  • proof