Restricted maximum likelihood estimation in generalized linear mixed models
Luca Maestrini, Francis K. C. Hui, Alan H. Welsh
TL;DR
The paper surveys how restricted maximum likelihood (REML) concepts extend to generalized linear mixed models (GLMMs), organizing the approaches into four main classes: approximate linearization, integrated likelihood, modified profile likelihood, and direct bias correction. Across both linear and non-linear random-effects settings, these methods often yield similar REML estimates and effectively reduce finite-sample bias in variance components, as demonstrated by a comparative simulation study in binary and count GLMMs. The authors advocate wider adoption of REML in GLMMs, while noting that theoretical guarantees remain piecemeal and that the practical choice of method should hinge on software availability and ease of implementation. They also discuss REML extensions to hierarchical generalized linear models (HGLMs) and highlight the ongoing need for formal asymptotic results and for exploring non-REML bias-reduction strategies.
Abstract
Restricted maximum likelihood (REML) estimation is a widely accepted and frequently used method for fitting linear mixed models, with its principal advantage being that it produces less biased estimates of the variance components. However, the concept of REML does not immediately generalize to the setting of non-normally distributed responses, and it is not always clear to what extent, either asymptotically or in finite samples, such generalizations reduce the bias of variance component estimates compared to standard unrestricted maximum likelihood estimation. In this article, we review various attempts that have been made over the past four decades to extend REML estimation to generalized linear mixed models. We establish four major classes of approaches, namely approximate linearization, integrated likelihood, modified profile likelihoods, and direct bias correction of the score function, and show that while these four classes may have differing motivations and derivations, they often arrive at a similar if not the same REML estimate. We compare the finite sample performance of these four classes, along with methods for REML estimation in hierarchical generalized linear models, through a numerical study involving binary and count data, with results demonstrating that all approaches perform similarly well in reducing the finite-sample bias of variance components. Overall, we believe REML estimation should be more widely adopted by practitioners using generalized linear mixed models, and that the exact choice of which REML approach to use should, at this point in time, be driven by software availability and ease of implementation.
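To make the bias claim concrete, the sketch below illustrates the simplest linear special case, not a GLMM and not any of the four classes reviewed here. In a Gaussian linear model with only a residual variance, the ML estimate divides the residual sum of squares by n, whereas the REML estimate divides by n - p, where p is the number of fixed effects. The function names, sample size, number of covariates, and true variance are all assumptions chosen for illustration.

```python
import numpy as np

def variance_estimates(y, X):
    """Return (ML, REML) estimates of the residual variance in y = X beta + e."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta_hat) ** 2))
    # ML divides by n; REML accounts for the p estimated fixed effects.
    return rss / n, rss / (n - p)

def simulate(n=20, p=5, sigma2=4.0, reps=2000, seed=0):
    """Average the ML and REML variance estimates over simulated data sets."""
    rng = np.random.default_rng(seed)
    # Hypothetical design: an intercept plus p - 1 standard normal covariates.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    beta = np.ones(p)
    estimates = np.array([
        variance_estimates(
            X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n), X
        )
        for _ in range(reps)
    ])
    return estimates.mean(axis=0)

ml_mean, reml_mean = simulate()
print(f"true variance 4.0, mean ML {ml_mean:.2f}, mean REML {reml_mean:.2f}")
```

With these settings the ML estimator has expectation sigma2 * (n - p) / n = 3.0, while the REML estimator is unbiased for 4.0, so the simulated averages should fall close to those values. How far this carries over to non-normal responses is exactly the question the review addresses.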
