Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding
Zenia Elise Damgaard Bæk, Moisès Coll Macià, Laurits Skov, Asger Hobolth
TL;DR
This paper tackles the limitations of standard hidden Markov model decoding by introducing finite Markov chain imbedding (FMCI) to obtain exact posterior distributions of hidden-state pattern statistics and by developing hybrid decoding, which blends local and global criteria via a tuning parameter $\alpha$. FMCI leverages the posterior as an inhomogeneous Markov chain to compute distributions for statistics such as the number of jumps, total time in a state, exact run lengths, and longest run, using structured, sparse block matrices to enable exact calculations. Hybrid decoding, formalized through a weighted geometric mean and an Artemis plot for $\alpha$ selection, yields decoding paths that outperform both Viterbi and Posterior decoding in intermediate regimes and provide a principled way to quantify uncertainty in the decoded sequence. The methods are illustrated on classical data sets (e.g., fetal movement counts and earthquakes) with supplemental code, and robustness and block-wise accuracy analyses demonstrate practical gains for complex HMM applications. Overall, the work advances posterior pattern analysis and robust decoding for HMMs, offering scalable techniques and actionable guidance for applied researchers.
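The blend of local and global criteria can be illustrated with a minimal Python sketch. It assumes a discrete-emission HMM with strictly positive parameters and one common parameterization of the hybrid objective, in which a path $x$ is scored by the weighted geometric mean $p(x \mid y)^{\alpha} \prod_t p(x_t \mid y)^{1-\alpha}$, so that $\alpha = 1$ recovers Viterbi decoding and $\alpha = 0$ recovers Posterior decoding; the paper's own convention for $\alpha$ may differ. The function names (`posterior_marginals`, `hybrid_decode`) and the toy parameters are hypothetical and are not taken from the paper's accompanying code.

```python
import numpy as np

def posterior_marginals(pi, A, B, obs):
    """Scaled forward-backward; returns gamma[t, i] = P(X_t = i | y_1..y_T)."""
    T, K = len(obs), len(pi)
    fwd = np.zeros((T, K))
    bwd = np.ones((T, K))
    fwd[0] = pi * B[:, obs[0]]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):
        fwd[t] = (fwd[t - 1] @ A) * B[:, obs[t]]
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):
        bwd[t] = A @ (B[:, obs[t + 1]] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    gamma = fwd * bwd
    return gamma / gamma.sum(axis=1, keepdims=True)

def hybrid_decode(pi, A, B, obs, alpha):
    """Maximize alpha * log p(x, y) + (1 - alpha) * sum_t log p(x_t | y).

    Assumes all entries of pi, A and B are strictly positive, so that
    no 0 * log(0) terms arise at alpha = 0 or alpha = 1.
    """
    gamma = posterior_marginals(pi, A, B, obs)
    T, K = gamma.shape
    log_local = (1.0 - alpha) * np.log(gamma)
    score = alpha * (np.log(pi) + np.log(B[:, obs[0]])) + log_local[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = score[:, None] + alpha * np.log(A)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + alpha * np.log(B[:, obs[t]]) + log_local[t]
    # Backtrack from the best final state
    path = np.zeros(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy two-state HMM with three emission symbols (illustrative values only)
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
obs = [0, 0, 2, 1, 2, 2, 0, 1]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_decode(pi, A, B, obs, alpha))
```

Because $p(x \mid y)$ and $p(x, y)$ differ only by a factor that does not depend on $x$, the sketch scores paths with the joint probability; the maximizing path is unchanged.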
Abstract
Two major tasks in applications of hidden Markov models are to (i) compute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve these two tasks, respectively. In the first part of our paper we use FMCI to compute posterior distributions of summary statistics such as the number of visits to a hidden state, the total time spent in a hidden state, the dwell time in a hidden state, and the longest run length. We use simulations from the hidden state sequence, conditional on the observed sequence, to establish the FMCI framework. In the second part of our paper we apply hybrid segmentation for improved decoding of an HMM. We demonstrate that hybrid decoding shows increased performance compared to Viterbi or Posterior decoding (often also referred to as global and local decoding, respectively), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure. Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means. We demonstrate and apply FMCI and hybrid decoding on various classical data sets, and supply accompanying code for reproducibility.
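To make the first task concrete, the following minimal Python sketch computes the exact posterior distribution of one summary statistic, the number of jumps (state changes) in the hidden sequence. It relies on the fact that the hidden sequence conditional on the observations is an inhomogeneous Markov chain, and it imbeds the running jump count into an enlarged state space of (count, state) pairs. The recursion is written as a plain dynamic program rather than with explicit block matrices, a discrete-emission HMM is assumed, and all function names and parameter values are hypothetical, not taken from the paper's accompanying code.

```python
import numpy as np

def posterior_chain(pi, A, B, obs):
    """Initial law and transition matrices of the hidden chain given obs."""
    T, K = len(obs), len(pi)
    # Scaled backward variables: bwd[t, i] proportional to P(y_{t+1..T} | X_t = i)
    bwd = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        bwd[t] = A @ (B[:, obs[t + 1]] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    init = pi * B[:, obs[0]] * bwd[0]
    init /= init.sum()
    trans = []
    for t in range(T - 1):
        # P_t[i, j] proportional to A[i, j] * B[j, y_{t+1}] * bwd[t+1, j]
        P = A * (B[:, obs[t + 1]] * bwd[t + 1])[None, :]
        trans.append(P / P.sum(axis=1, keepdims=True))
    return init, trans

def jump_count_posterior(pi, A, B, obs):
    """Return d with d[k] = P(number of jumps = k | obs), k = 0..T-1."""
    init, trans = posterior_chain(pi, A, B, obs)
    T, K = len(obs), len(pi)
    # f[k, i] = P(k jumps so far, current state i | obs)
    f = np.zeros((T, K))
    f[0] = init
    for P in trans:
        diag = np.diag(P)
        stay = f * diag[None, :]          # i -> i keeps the count
        move = f @ (P - np.diag(diag))    # i -> j != i raises the count by one
        f = stay
        f[1:] += move[:-1]
    return f.sum(axis=1)

# Toy two-state HMM with three emission symbols (illustrative values only)
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
obs = [0, 0, 2, 1, 2, 2, 0, 1]

dist = jump_count_posterior(pi, A, B, obs)
print(dist)
```

The same pattern extends to other statistics by changing what the auxiliary coordinate tracks, for example the current run length instead of the jump count.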
