Fast and scalable inference in hidden Markov models with Gaussian fields

Jan-Ole Fischer

Abstract

Hidden Markov models (HMMs) are powerful tools for analysing time series data that depend on discrete, unobserved underlying states. As such, they have gained prominence across numerous empirical disciplines, in particular ecology, medicine, and economics. However, the increasing complexity of empirical data is often accompanied by additional latent structure such as spatial effects, temporal trends, or measurement perturbations. Gaussian fields provide an attractive building block for incorporating such structured latent variation into HMMs. Fast inference methods for Gaussian fields have emerged through the stochastic partial differential equation (SPDE) approach. Due to their sparse representation, these integrate well with novel frequentist estimation methods for random-effects models via the use of automatic differentiation and the Laplace approximation. Scaling to high dimensions requires tools such as (R)TMB to exploit sparsity in the Hessian with respect to the latent variables, a property satisfied by SPDE fields but violated by the HMM likelihood. We present a modified forward algorithm to compute the HMM likelihood, constructing sparsity in the Hessian and consequently enabling fast and scalable inference. We demonstrate the practical feasibility and usefulness of the approach through simulations and two case studies: detecting stellar flares and modelling the movement of lions.

Paper Structure

This paper contains 18 sections, 2 theorems, 58 equations, 13 figures, and 2 algorithms.

Key Result

Theorem 1

Let $\ell_T(\bm{\theta}, \bm{x})$ denote the exact log-likelihood and $\tilde{\ell}_T(\bm{\theta}, \bm{x})$ its bandwidth-$k$ forward approximation. Consider either of the following settings: Assume that $\bm\theta \in \bm\Theta$, $\bm x \in \mathcal{X}$, $\bm{\Theta} \times \mathcal{X}$ is compact, the state-dependent densities satisfy $0 < m \le f_j(\cdot) \le M < \infty$ for all $j,t$, and the

Figures (13)

  • Figure 1: Visual demonstration of the approximation employed for computing the banded forward algorithm. Each block is used twice: 1) for computing its likelihood contribution and 2) for constructing an approximate scaled forward variable for the next block's likelihood contribution.
  • Figure 2: Comparison of the Hessian matrix with respect to the observation sequence $x_1, \dots, x_{60}$ when the log-likelihood is calculated using the regular forward algorithm (left) and using the banded forward algorithm with bandwidth $k = 5$ (right). Gray corresponds to values that are very close to zero in magnitude, while white corresponds to exact zeros.
  • Figure 3: Top panel: Stacked locally decoded state probabilities $\Pr(S_t = i \mid \bm y)$ for the states quiet (black), firing (red), and decaying (orange) for a subsection of the brightness time series. Bottom panel: Corresponding brightness values, colour-coded according to the Viterbi-decoded (most likely) state [viterbi2003error]. The purple line is the estimated quasi-periodic trend, i.e. the posterior mode of the Gaussian process. We also show an approximate 95% credible interval for the fitted smooth in light purple, which is barely visible due to the high estimation precision.
  • Figure 4: Transition probability from active to resting, based on the posterior mode of the spatial field fitted to the lion data. Dark blue areas indicate a low probability of transitioning into the resting state, while yellow areas indicate a high probability. Pixels in the lower-right corner fall outside the triangulated mesh, so the field is predicted as zero there.
  • Figure 5: Probability of being active as a function of the time of day, obtained based on the periodically stationary state distribution. The gray band corresponds to pointwise 95% confidence intervals based on the approximate multivariate normal distribution of the MLE.
  • ...and 8 more figures
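
The block construction described for Figure 1 can be illustrated with a small numerical sketch. This is not the paper's algorithm or its (R)TMB implementation: it is one plausible reading of the caption, written in Python with NumPy, in which each block's forward pass is started from an approximate forward variable built from the previous block alone (itself restarted from the stationary distribution). The function names `stationary`, `forward_pass`, and `banded_loglik`, the two-state Gaussian example, and all parameter values are assumptions made for illustration. Under this reading, the likelihood contribution of a block depends only on observations in that block and the one before it, which is what would produce a banded Hessian like the one shown in Figure 2.

```python
import numpy as np

def stationary(Gamma):
    """Stationary distribution: solves delta @ Gamma = delta with sum(delta) = 1."""
    n = Gamma.shape[0]
    A = np.vstack([(np.eye(n) - Gamma).T, np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]

def forward_pass(pred, Gamma, dens):
    """Scaled forward recursion over the rows of dens.

    pred is the predictive state distribution at the first time point.
    Returns the log-likelihood contribution and the predictive
    distribution for the time point following the last row.
    """
    ll = 0.0
    for d in dens:
        v = pred * d            # weight by state-dependent densities
        c = v.sum()             # scaling constant = one-step likelihood
        ll += np.log(c)
        pred = (v / c) @ Gamma  # normalise, then propagate one step
    return ll, pred

def banded_loglik(dens, Gamma, delta, k):
    """Hypothetical block approximation with block length k (an assumption)."""
    total = 0.0
    for s in range(0, dens.shape[0], k):
        if s == 0:
            pred = delta
        else:
            # Use of the previous block no. 2: build an approximate forward
            # variable from that block only, restarted from delta.
            _, pred = forward_pass(delta, Gamma, dens[s - k:s])
        # Use no. 1: likelihood contribution of the current block.
        ll, _ = forward_pass(pred, Gamma, dens[s:s + k])
        total += ll
    return total

# Illustrative two-state Gaussian HMM (all values are made up).
rng = np.random.default_rng(0)
T = 200
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([0.0, 3.0])
delta = stationary(Gamma)

states = np.empty(T, dtype=int)
states[0] = rng.choice(2, p=delta)
for t in range(1, T):
    states[t] = rng.choice(2, p=Gamma[states[t - 1]])
x = rng.normal(mu[states], 1.0)

# T x 2 matrix of state-dependent densities N(x_t; mu_j, 1).
dens = np.exp(-0.5 * (x[:, None] - mu) ** 2) / np.sqrt(2 * np.pi)

exact, _ = forward_pass(delta, Gamma, dens)
errors = {k: abs(banded_loglik(dens, Gamma, delta, k) - exact)
          for k in (2, 5, 10, 50, T)}
for k, err in errors.items():
    print(f"k = {k:3d}  |approx - exact| = {err:.3e}")
```

With `k` equal to the series length there is a single block, so the sketch reduces to the regular forward algorithm. For smaller `k` the printed errors give a rough feel for how quickly the approximation improves as the bandwidth grows in this toy setting.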

Theorems & Definitions (8)

  • Theorem 1: Geometric decay of the forward-likelihood approximation error
  • Lemma 1: Filter bound propagates to log-likelihood
  • Proof
  • Proof of Theorem 1: Case 1
  • Proof of Theorem 1: Case 2
  • Remark 1: Compactness of $\mathcal{X}$
  • Remark 2: Boundedness of state-dependent densities
  • Remark 3: Uniform ergodicity of the Markov chain