Phase-type frailty models: A flexible approach to modeling unobserved heterogeneity in survival analysis

Jorge Yslas

TL;DR

This paper introduces phase-type (PH) distributions as a flexible frailty specification for survival analysis, covering a univariate PH frailty model and its multivariate extensions (shared and correlated). By deriving closed-form functionals and an EM-based maximum-likelihood estimation framework, the author shows that PH frailty can approximate any nonnegative frailty while keeping computations tractable. The approach is demonstrated on several numerical examples, including phase-type-Gompertz, matrix-Pareto, and lognormal settings, and is compared against traditional frailties, with an emphasis on improved fit and modeling flexibility. The work provides a practical toolkit for capturing unobserved heterogeneity in survival and related insurance contexts, with pathways for higher-dimensional extensions and applications.

Abstract

Frailty models are essential tools in survival analysis for addressing unobserved heterogeneity and random effects in the data. These models incorporate a random effect, the frailty, which is assumed to impact the hazard rate multiplicatively. In this paper, we introduce a novel class of frailty models in both univariate and multivariate settings, using phase-type distributions as the underlying frailty specification. We investigate the properties of these phase-type frailty models and develop expectation-maximization algorithms for their maximum-likelihood estimation. In particular, we show that the resulting model shares similarities with the Gamma frailty model, has closed-form expressions for its functionals, and can approximate any other frailty model. Through a series of simulated and real-life numerical examples, we demonstrate the effectiveness and versatility of the proposed models in addressing unobserved heterogeneity in survival analysis.
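
As an illustration of the multiplicative frailty structure described above, the following minimal sketch evaluates a marginal survival function numerically. With conditional hazard $Z h_0(t)$, the marginal survival function is $E[\exp(-Z H_0(t))]$, the Laplace transform of the frailty evaluated at the cumulative baseline hazard $H_0(t)$. For $Z \sim \mbox{PH}(\boldsymbol{\pi}, \boldsymbol{T})$ without an atom at zero, that transform has the standard matrix form $\boldsymbol{\pi}(u\boldsymbol{I} - \boldsymbol{T})^{-1}\boldsymbol{t}$ with $\boldsymbol{t} = -\boldsymbol{T}\boldsymbol{1}$. The function names, the Erlang(2) frailty, and the Weibull baseline are assumptions chosen for this example and are not taken from the paper.

```python
import numpy as np


def ph_laplace(pi, T, u):
    """Laplace transform E[exp(-u Z)] of Z ~ PH(pi, T), for u >= 0."""
    pi = np.asarray(pi, dtype=float)
    T = np.asarray(T, dtype=float)
    p = T.shape[0]
    exit_rates = -T @ np.ones(p)  # t = -T 1
    # pi (uI - T)^{-1} t, computed with a linear solve instead of an explicit inverse
    return float(pi @ np.linalg.solve(u * np.eye(p) - T, exit_rates))


def marginal_survival(pi, T, cum_hazard, t):
    """Marginal survival S(t) = E[exp(-Z H0(t))] under a PH frailty Z."""
    return ph_laplace(pi, T, cum_hazard(t))


def weibull_cum_hazard(t):
    """Hypothetical Weibull cumulative baseline hazard H0(t) = 0.5 * t^1.5."""
    return 0.5 * t ** 1.5


# Assumed frailty: Erlang with 2 phases and rate 2 (mean 1), written as a PH distribution.
# Its Laplace transform is (2 / (2 + u))^2, which is also a Gamma(2, 2) frailty.
pi = [1.0, 0.0]
T = [[-2.0, 2.0],
     [0.0, -2.0]]

for t in (0.5, 1.0, 2.0):
    print(t, marginal_survival(pi, T, weibull_cum_hazard, t))
```

Because the Erlang frailty used here is a special case of both the Gamma and the phase-type families, the matrix formula can be checked against the familiar Gamma-frailty expression before moving to richer PH structures.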

Paper Structure

This paper contains 18 sections, 2 theorems, 78 equations, 7 figures, 4 algorithms.

Key Result

Theorem 2.2

Let $V$ be any non-negative random variable. Then there exists a sequence of random variables $(Z_n)_{n\geq 1}$, where $Z_n \sim \mbox{PH}(\boldsymbol{\pi}_n, \boldsymbol{T}_n)$, $n \geq 1$, such that $Z_n \stackrel{d}{\to} V$ as $n \to \infty$, where $\stackrel{d}{\to}$ denotes convergence in distribution (weak convergence).
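
A simple special case gives some intuition for this kind of denseness statement. An Erlang distribution with $n$ phases and rate $n/c$ is phase-type, has mean $c$ and variance $c^2/n$, and therefore converges in distribution to the constant $c$ as $n$ grows. The sketch below checks this numerically. It illustrates the statement only for a degenerate $V$ and is not the construction used in the paper's proof. The function name and the values of $c$ and $n$ are assumptions made for the example.

```python
import math


def erlang_cdf(n, rate, x):
    """CDF of an Erlang(n, rate) distribution, i.e. a PH distribution with n phases in series."""
    if x <= 0.0:
        return 0.0
    log_mu = math.log(rate * x)
    # P(Z <= x) = 1 - sum_{k < n} exp(-rate*x) (rate*x)^k / k!, with each term built in log space
    tail = sum(
        math.exp(-rate * x + k * log_mu - math.lgamma(k + 1))
        for k in range(n)
    )
    return 1.0 - tail


c = 2.0  # assumed target: V equal to the constant c
for n in (1, 10, 100):
    below = erlang_cdf(n, n / c, 0.8 * c)  # should shrink towards 0
    above = erlang_cdf(n, n / c, 1.2 * c)  # should grow towards 1
    print(f"n={n:3d}  F_n(0.8c)={below:.4f}  F_n(1.2c)={above:.4f}")
```

As $n$ increases, the distribution function approaches the step function that jumps from 0 to 1 at $c$, which is exactly convergence in distribution to a point mass.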

Figures (7)

  • Figure 5.1: Cross-ratio function of a shared phase-type frailty model with frailty a mixture of an exponential and an Erlang distribution (left), and cross-ratio function of a shared phase-type frailty model with frailty a generalized Coxian distribution (right). Weibull baseline hazards were employed in both cases.
  • Figure 5.2: Cross-ratio function of a shared inverse Gaussian frailty model (left), and cross-ratio function of a shared phase-type frailty model that approximates the shared inverse Gaussian model (right). Weibull baseline hazards were employed in both cases.
  • Figure 6.1: Histogram of lifetimes of the Swedish female population that died in the year 2011 at ages 50 to 100 versus density of the fitted phase-type-Gompertz frailty model and density of fitted Gompertz distribution (left) and density function of the underlying phase-type frailty (right).
  • Figure 6.2: Cumulative hazard functions of the fitted matrix-Pareto type III distribution and fitted Gamma frailty model versus the non-parametric Nelson-Aalen estimator of the sample (left) and density function of the underlying phase-type frailty (right).
  • Figure 6.3: QQ-plots of simulated sample from the shared lognormal frailty model versus fitted shared phase-type frailty model.
  • ...and 2 more figures

Theorems & Definitions (12)

  • Example 2.1: Generalized Erlang
  • Theorem 2.2
  • Remark 2.1
  • Example 3.1: Gamma frailty
  • Example 3.2: Inverse Gaussian frailty
  • Remark 3.1
  • Corollary 3.3
  • proof
  • Remark 4.1: On the structure of the phase-type parameters
  • Remark 4.2: On the computational challenges of the model estimation
  • ...and 2 more