Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models
Alexandre Mösching, Housen Li, Axel Munk
TL;DR
This work tackles the decoding of hidden state sequences in large-scale HMMs, where the observation sequence is long relative to the number of states. It introduces Quick Adaptive Ternary Segmentation (QATS), a greedy divide-and-conquer decoder that uses ternary segmentation and optimistic search to achieve a runtime polylogarithmic in the sequence length $n$ and cubic in the number of states $m$. The method precomputes log-likelihood components and applies local two- and three-segment maximizations via the subroutines OSH2 and OSH3. The resulting QATS-path is admissible and, in simulations, is obtained substantially faster than with Viterbi or PMAP at competitive accuracy. A reference implementation is available in the R-package QATS. Overall, QATS enables scalable decoding for large HMMs, with practical speed gains in applications where state changes are infrequent, and opens avenues for parallelization and for extensions to broader change-point detection problems.
Abstract
Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process, which is a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the main goals in nearly all HMM-based data analyses. Existing decoding algorithms such as Viterbi and the pointwise maximum a posteriori (PMAP) algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the hidden chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence and cubic in the size of the state space, hence particularly suited for large-scale HMMs with relatively few states. The procedure also suggests an effective way of storing the data, namely as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, while remaining admissible. The maximization is performed only approximately, using an adaptive search procedure. Our simulations demonstrate the speedups offered by QATS in comparison to Viterbi and PMAP, and are complemented by a precision analysis. An implementation of QATS is available in the R-package QATS on GitHub.
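
To illustrate why storing the data as cumulative sums helps, the following Python sketch shows one way a local score of a two-segment path could be evaluated in constant time once cumulative sums of the log-emission densities are available. It is a minimal sketch under stated assumptions, not the implementation from the R-package QATS: all function and variable names are hypothetical, the initial-distribution term is omitted, and the linear scan in `best_split_bruteforce` is a plain stand-in for the adaptive search, which it does not reproduce.

```python
import numpy as np

def cumulative_log_emissions(log_f):
    """log_f[k, t] is the log-density of observation t under state k (m x n).
    Returns G (m x (n + 1)) with G[k, t] = sum of log_f[k, :t]."""
    m, n = log_f.shape
    G = np.zeros((m, n + 1))
    G[:, 1:] = np.cumsum(log_f, axis=1)
    return G

def segment_score(G, log_P, k, l, r):
    """Log-likelihood contribution of staying in state k on the half-open
    index range [l, r): emissions plus r - l - 1 self-transitions.
    Costs O(1) thanks to the cumulative sums."""
    return G[k, r] - G[k, l] + (r - l - 1) * log_P[k, k]

def two_segment_score(G, log_P, k1, k2, l, s, r):
    """Score of a local path equal to k1 on [l, s) and k2 on [s, r)."""
    return (segment_score(G, log_P, k1, l, s)
            + log_P[k1, k2]
            + segment_score(G, log_P, k2, s, r))

def best_split_bruteforce(G, log_P, k1, k2, l, r):
    """Assumption: a linear scan over all split points l < s < r, used here
    only for illustration in place of an adaptive search."""
    return max(range(l + 1, r),
               key=lambda s: two_segment_score(G, log_P, k1, k2, l, s, r))
```

Each score evaluation touches only a fixed number of entries of `G`, so the cost of a local maximization is driven by the number of split points examined. This is where a search that inspects only a few candidates, rather than all of them, can avoid a cost linear in the sequence length.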
