
Restricted maximum likelihood estimation in generalized linear mixed models

Luca Maestrini, Francis K. C. Hui, Alan H. Welsh

TL;DR

The paper surveys how restricted maximum likelihood (REML) concepts extend to generalized linear mixed models (GLMMs), organizing the approaches into four main classes: approximate linearization, integrated likelihood, modified profile likelihood, and direct bias correction. Across both linear and non-linear random-effects settings, these methods often yield similar REML estimates and effectively reduce finite-sample bias in variance components, as demonstrated by a comparative simulation study in binary and count GLMMs. The authors advocate wider adoption of REML in GLMMs, while noting that theoretical guarantees remain piecemeal and that practical choice should hinge on software availability and ease of implementation. They also discuss REML extensions to hierarchical generalized linear models (HGLMs) and highlight the ongoing need for formal asymptotic results and for exploring non-REML bias-reduction strategies.

Abstract

Restricted maximum likelihood (REML) estimation is a widely accepted and frequently used method for fitting linear mixed models, with its principal advantage being that it produces less biased estimates of the variance components. However, the concept of REML does not immediately generalize to the setting of non-normally distributed responses, and it is not always clear to what extent, either asymptotically or in finite samples, such generalizations reduce the bias of variance component estimates compared to standard unrestricted maximum likelihood estimation. In this article, we review various attempts that have been made over the past four decades to extend REML estimation to generalized linear mixed models. We establish four major classes of approaches, namely approximate linearization, integrated likelihood, modified profile likelihoods, and direct bias correction of the score function, and show that while these four classes may have differing motivations and derivations, they often arrive at a similar if not the same REML estimate. We compare the finite sample performance of these four classes, along with methods for REML estimation in hierarchical generalized linear models, through a numerical study involving binary and count data, with results demonstrating that all approaches perform similarly well in reducing the finite sample bias of variance components. Overall, we believe REML estimation should be more widely adopted by practitioners using generalized linear mixed models, and that the exact choice of which REML approach to use should, at this point in time, be driven by software availability and ease of implementation.
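
To make the abstract's opening claim concrete, the following is a minimal sketch of the bias that REML removes in the simplest Gaussian case: an intercept-only normal model, where the ML estimate of the variance divides the residual sum of squares by n and the REML estimate divides by n - 1. It is not taken from the paper and does not implement any of the four GLMM approaches; the function name `simulate_variance_bias` and all parameter values are illustrative assumptions.

```python
import random


def simulate_variance_bias(n=5, true_var=1.0, reps=20000, seed=0):
    """Return (mean ML estimate, mean REML estimate) of the variance.

    Hypothetical helper for illustration: intercept-only normal model,
    so there is p = 1 fixed effect (the mean) and no random effects.
    """
    rng = random.Random(seed)
    sd = true_var ** 0.5
    ml_total = 0.0
    reml_total = 0.0
    for _ in range(reps):
        y = [rng.gauss(0.0, sd) for _ in range(n)]
        ybar = sum(y) / n
        rss = sum((yi - ybar) ** 2 for yi in y)
        # ML divides by n, ignoring that the mean was estimated from the data.
        ml_total += rss / n
        # REML divides by n - p, accounting for the p = 1 estimated fixed effect.
        reml_total += rss / (n - 1)
    return ml_total / reps, reml_total / reps


ml_mean, reml_mean = simulate_variance_bias()
print(f"true variance:      1.000")
print(f"mean ML estimate:   {ml_mean:.3f}")
print(f"mean REML estimate: {reml_mean:.3f}")
```

In this toy setting the ML estimator has expectation (n - 1)/n times the true variance, so with small n its average falls visibly below the truth while the REML average does not. The paper's concern is that no equally simple correction exists once the responses are non-normal.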

Paper Structure (15 sections, 32 equations, 4 figures)


Figures (4)

  • Figure 1: Boxplots of bias of the estimates of $\Sigma_{11}$ from the simulation study involving binary responses generated from a logistic GLMM. In each panel, the first five boxplots are based on unrestricted maximum likelihood estimation methods, while the remaining five boxplots are based on implementations of REML. Here and in the following figures, the codes 'PQL', 'TMB', 'MPL', 'DBC' and 'HGLM' respectively refer to estimation performed via the function glmmPQL from the R package MASS, the function glmmTMB from the R package glmmTMB, the code of Bellio and Brazzale (2011) for the modified profile likelihood method, the code of Liao and Lipsitz (2002) for the direct bias correction method, and the function dhglmfit from the R package dhglm.
  • Figure 2: Boxplots of bias of the estimates of $\Sigma_{12}$ from the simulation study involving binary responses generated from a logistic GLMM. In each panel, the first four boxplots are based on unrestricted maximum likelihood estimation methods, while the remaining four boxplots are based on implementations of REML. The codes have the same meaning as in Figure 1. Note that the function dhglmfit from the R package dhglm does not include an option for correlated random effects; therefore the 'HGLM' and 'HGLM REML' results are not available for $\Sigma_{12}$.
  • Figure 3: Boxplots of bias of the estimates of $\Sigma_{22}$ from the simulation study involving binary responses generated from a logistic GLMM. In each panel, the first five boxplots are based on unrestricted maximum likelihood estimation methods, while the remaining five boxplots are based on implementations of REML. The codes have the same meaning as in Figure 1.
  • Figure 4: Boxplots of bias of the estimates of $\sigma^2$ from the simulation study involving count responses generated from a Poisson GLMM. In each panel, the first five boxplots are based on unrestricted maximum likelihood estimation methods, while the remaining five boxplots are based on implementations of REML. The codes have the same meaning as in Figure 1.