Estimating Zero-inflated Negative Binomial GAMLSS via a Balanced Gradient Boosting Approach with an Application to Antenatal Care Data from Nigeria

Alexandra Daub; Elisabeth Bergherr

TL;DR

To examine the influence of socio-economic factors on the distribution of the number of antenatal care visits in Nigeria, this work generalizes boosting of GAMLSS with shrunk optimal step lengths to base-learners beyond simple linear models and to a more complex response variable distribution.

Abstract

Statistical boosting algorithms are renowned for their intrinsic variable selection and enhanced predictive performance compared to classical statistical methods, making them especially useful for complex models such as generalized additive models for location, scale and shape (GAMLSS). Boosting this model class can suffer from imbalanced updates across the distribution parameters as well as long computation times. Shrunk optimal step lengths have been shown to address these issues. To examine the influence of socio-economic factors on the distribution of the number of antenatal care visits in Nigeria, we generalize boosting of GAMLSS with shrunk optimal step lengths to base-learners beyond simple linear models and to a more complex response variable distribution. In an extensive simulation study and in the application, we demonstrate that shrunk optimal step lengths yield a more balanced regularization of the overall model and enhance computational efficiency across diverse settings, in particular in the presence of base-learners that penalize the size of the fit.
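To make the idea of a shrunk optimal step length concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a zero-inflated negative binomial in the common NB2 parameterization (mean `mu` with log link, overdispersion `alpha`, zero-inflation probability `pi`), a numerical golden-section line search in place of the first-order conditions derived in the paper, and a shrinkage factor of 0.1. All function names, the toy data and the search interval are hypothetical.

```python
import math

def zinb_logpmf(y, mu, alpha, pi):
    """Log-pmf of a zero-inflated NB2 (assumed parameterization: r = 1/alpha)."""
    r = 1.0 / alpha
    log_nb = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
              + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    if y == 0:
        # A zero is either a structural zero or a zero from the count part.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    return math.log(1.0 - pi) + log_nb

def neg_loglik(y, eta_mu, alpha, pi):
    """Empirical risk: negative log-likelihood with log link for mu."""
    return -sum(zinb_logpmf(yi, math.exp(e), alpha, pi)
                for yi, e in zip(y, eta_mu))

def golden_section(f, lo, hi, tol=1e-6):
    """Generic golden-section search for a local minimum of f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

def shrunk_optimal_step(y, eta_mu, h, alpha, pi,
                        shrinkage=0.1, interval=(0.0, 10.0)):
    """Line search along the base-learner fit h, then shrink the result.

    The shrinkage factor and the search interval are assumptions.
    """
    def risk(nu):
        return neg_loglik(y, [e + nu * hi for e, hi in zip(eta_mu, h)],
                          alpha, pi)
    nu_star = golden_section(risk, *interval)
    return nu_star, shrinkage * nu_star

# Hypothetical toy data: counts, current predictor for mu, and a fitted
# base-learner direction h (in boosting, h comes from fitting a base-learner
# to the negative gradient of the risk).
y = [0, 0, 1, 0, 4, 6, 5, 3]
eta_mu = [0.0] * len(y)
h = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
nu_star, nu_shrunk = shrunk_optimal_step(y, eta_mu, h, alpha=0.5, pi=0.2)
print(nu_star, nu_shrunk)
```

In a full boosting run, an update of this kind would be computed for each distribution parameter in turn, so that the step length adapts to the scale of each parameter's gradient instead of being one fixed constant.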

Paper Structure (25 sections, 25 equations, 19 figures, 9 tables, 1 algorithm)

Introduction
Boosting Generalized Additive Models for Location, Scale and Shape
Generalized Additive Models for Location, Scale and Shape
Model-based Boosting
Non-cyclical Boosting with Shrunk Optimal Step Lengths
Combining Shrunk Optimal Step Lengths with Non-linear Base-learners
Computing Shrunk Optimal Step Lengths for ZINB-GAMLSS
Simulation Study
Gaussian setting
ZINB setting
Modeling the Number of Antenatal Care Visits in Nigeria
Conclusion and Discussion
Derivation of the first-order condition for optimal step lengths in ZINB-GAMLSS
Derivation of the first-order condition of $\boldsymbol{\nu_\mu^*}$
Derivation of the first-order condition of $\boldsymbol{\nu_\alpha^*}$
...and 10 more sections

Figures (19)

Figure 1: Coefficient paths for a Gaussian location and scale model in the simulation setting without additional non-linear effect (\ref{daub:simu_gaussian_setting}). Dark blue paths represent informative and gray paths uninformative effects. The dashed and dotted vertical lines represent potential stopping iterations
Figure 2: Shrunk optimal step lengths for varying levels of penalization of the base-learner representing the categorical effect (columns) in the Gaussian simulation setting (\ref{daub:simu_gaussian_setting})
Figure 3: Comparison of the penalty parameter and the mean step length in the first 100 iterations of a simulation run for base-learners representing different effects (columns) in the Gaussian simulation setting (\ref{daub:simu_gaussian_setting}). For the explicit specification of the base-learners, see Sect. \ref{daub:section_simulations}
Figure 4: Distribution of the coefficient estimates in the Gaussian simulation setting (\ref{daub:simu_gaussian_setting}) with categorical effects. The red horizontal lines represent the true coefficients
Figure 5: Partial effects of the informative non-linear effect in the Gaussian simulation setting (\ref{daub:simu_gaussian_setting}). The red dashed lines represent the true partial effect
...and 14 more figures
