Summarising mortality data with a time-dependent beta latent variable model

Pedro Menezes de Araújo, Isobel Claire Gormley, Thomas Brendan Murphy

Abstract

Age-specific probabilities of death provide a snapshot of population mortality at the country level at a given point in time. Due to the high dimensionality of the data, summarising mortality information is essential for various analyses, such as visualisation and clustering. We propose the use of beta latent variable (BLV) models to summarise mortality information without data transformation. A time-dependent version of the BLV model is developed by incorporating an autoregressive prior for the latent effects. This model aims to represent mortality data with a small set of $K$ latent effects while accounting for time dependence between these effects. Inference is performed using Bayesian methods, with posterior samples generated via Hamiltonian Monte Carlo. The BLV model is applied to probabilities of death from the Human Mortality Database, covering 41 countries and 23 age-specific probabilities of death over several periods. The time-dependent BLV model with $K=6$ latent effects accurately reconstructs observed mortality data, and the model parameters have intuitive and insightful interpretations. The time-dependent BLV outperforms the standard Gaussian factor analysis model applied to logit probability of death, and demonstrates that BLV models can effectively summarise mortality data.
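To make the model structure described in the abstract concrete, the following is a minimal simulation sketch, not the authors' specification. The abstract does not state the link function, the beta parameterisation, or the exact form of the autoregressive prior. The sketch therefore assumes a logit link, a mean-precision beta distribution, and a stationary AR(1) process for the latent effects. The symbols `alpha`, `theta` and `kappa` are borrowed from the figure captions, but their roles here (loadings, latent effects, precision) are assumptions, as are the function name `simulate_blv`, the age intercepts, and all numerical values.

```python
import numpy as np


def simulate_blv(n_countries=5, n_ages=23, n_periods=10, K=2,
                 phi=0.8, kappa=500.0, seed=0):
    """Simulate probabilities of death from an illustrative time-dependent
    beta latent variable model. All modelling choices are assumptions."""
    rng = np.random.default_rng(seed)

    # Assumed age-specific intercepts on the logit scale (hypothetical values).
    intercept = np.linspace(-5.0, -1.0, n_ages)

    # Assumed loadings alpha[x, k]: how age group x responds to latent effect k.
    alpha = rng.normal(0.0, 0.3, size=(n_ages, K))

    # Latent effects theta[t, i, k] for period t, country i, dimension k,
    # following an assumed stationary AR(1) process with unit marginal variance.
    theta = np.empty((n_periods, n_countries, K))
    theta[0] = rng.normal(0.0, 1.0, size=(n_countries, K))
    innovation_sd = np.sqrt(1.0 - phi ** 2)
    for t in range(1, n_periods):
        noise = rng.normal(0.0, innovation_sd, size=(n_countries, K))
        theta[t] = phi * theta[t - 1] + noise

    # Linear predictor and assumed logit link for the beta mean.
    eta = intercept + theta @ alpha.T          # shape (T, I, X)
    mu = 1.0 / (1.0 + np.exp(-eta))

    # Mean-precision beta: shape parameters mu * kappa and (1 - mu) * kappa.
    q = rng.beta(mu * kappa, (1.0 - mu) * kappa)
    return q, mu, theta, alpha


if __name__ == "__main__":
    q, mu, theta, alpha = simulate_blv()
    print(q.shape)  # (periods, countries, age groups)
```

Because the response is modelled directly on the unit interval, the simulated probabilities need no logit transformation before use, which is the property the abstract emphasises. The default of 23 age groups matches the data described above; the remaining dimensions are kept small purely for illustration.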

Paper Structure

This paper contains 24 sections, 13 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: Probabilities of death on the original (a) and logit (b) scale for all countries for some selected periods.
  • Figure 2: Estimated Kendall's $\tau$ coefficient for the probability of death trend for different age groups and countries.
  • Figure 3: Correlation matrix of age group mortalities, calculated using all country–year pairs as observations and age-group probabilities of death as variables.
  • Figure 4: Boxplots of $\text{BIC}_{m}$, $\text{WAIC}_{c}$ and $\log(\kappa)$ for each replicate and true $K \in \{2, 4\}$.
  • Figure 5: Estimated values with 95% HPD for $\alpha_{xk}$ in (a) and $\theta_{ik}^{(t)}$ in (b) for all replicates when $K$ is fixed at its true value of 2 or 4.
  • ...and 9 more figures