
Extending the saemix package for R to fit non Gaussian outcomes

Emmanuelle Comets, Maud Delattre, Belhal Karimi

TL;DR

This paper extends the saemix package for R to accommodate a variety of models for non-Gaussian data. In a simulation study, the extended package recovered the true parameter values well and was stable across different starting values for the parameters.

Abstract

Background and Objectives: Longitudinal data are increasingly collected in clinical trials to provide information on treatment action and disease evolution. The trajectory of continuous biomarkers such as target hormone concentrations or viral loads can then be modelled in relation to the occurrence of events such as recovery or hospitalisation. Other studies may include repeated measurements of discrete pain scores, number of episodes (count) or occurrence of events (survival). Non-linear mixed-effect models (NLMEM) can handle individual differences in trajectories while modelling the underlying population evolution and are the natural choice for their analysis. The saemix package for R is one of the few open-source solutions and the most flexible. In this paper, we extend it to accommodate a variety of models for non-Gaussian data.

Methods: The saemix package estimates parameters through the Stochastic Approximation Expectation-Maximisation (SAEM) algorithm. Within the package, non-Gaussian models are specified by their log-likelihood functions, affording maximal control over model formulation. We extend estimation algorithms as well as exploratory and diagnostic plots for non-Gaussian data. Bootstrap approaches were implemented to estimate parameter uncertainty. To evaluate the performance of saemix, we performed a simulation study based on the toenail dataset, which contains repeated binary data from a randomised clinical trial.

Results: In the simulation study, saemix recovered the true parameter values well and was stable across different starting values for the parameters. An algorithm that jointly searches for the covariate and interindividual variability models was also implemented to build the covariate model, and was applied to categorical and survival-type data.
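To make the idea of specifying a non-Gaussian model through its log-likelihood more concrete, here is a minimal sketch in Python for repeated binary data such as the toenail outcomes. It is not the saemix interface: the function names, the argument layout and the logistic model with an individual intercept and slope are all assumptions chosen for illustration.

```python
import math


def _log1p_exp(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def binary_loglik(intercept, slope, times, outcomes):
    """Per-observation log-likelihood of 0/1 outcomes for one subject.

    Assumed (hypothetical) model: logit P(y = 1 at time t) = intercept + slope * t,
    where intercept and slope are that subject's individual parameters.
    """
    logpdf = []
    for t, y in zip(times, outcomes):
        logit = intercept + slope * t
        # log P(y=1) = -log(1 + exp(-logit)); log P(y=0) = -log(1 + exp(logit))
        signed = logit if y == 1 else -logit
        logpdf.append(-_log1p_exp(-signed))
    return logpdf


# One subject observed at three visits (made-up values).
contributions = binary_loglik(-1.0, -0.3, [0.0, 1.0, 2.0], [1, 0, 0])
subject_loglik = sum(contributions)
```

In an estimation algorithm of the SAEM type, a function of this kind would be evaluated repeatedly for simulated individual parameters, so returning one value per observation keeps the summation over subjects and visits outside the model definition.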

Paper Structure (43 sections, 27 equations, 18 figures, 9 tables)

Figures (18)

  • Figure 1: Time-course of the proportion of subjects with toenail infection at each visit, stratified by treatment.
  • Figure 2: Relative estimation errors for the parameters in the original design (left) and the design with variability on both parameters (right). Dashed lines delineate absolute relative biases within 10% and dotted lines denote biases within 5%. The red insert highlights the interval corresponding to the mean (red dot) plus or minus 2 standard deviations. The graph was trimmed to +/-100, omitting 26 values for $\beta$.
  • Figure 3: Visual Predictive Check (VPC) for each value of the score in the proportional odds model with covariates, stratified by treatment. VPCs are produced by simulating descriptive statistics under the model and design of the original dataset and comparing them with the observed values. Here, we simulated the scores for 1000 datasets under the same design, then computed the 95% prediction intervals of the proportion of each score over time in each treatment group. We then overlaid the observed proportions.
  • Figure 4: Visual Predictive Checks for the ZIP model, regrouping the high count categories and stratifying on gender.
  • Figure 5: Survival in the lung cancer data, stratified by gender (left) or ECOG assessment (right). The same graphs could be obtained in months instead of days by transforming the data beforehand, but since the dataset is in days we keep this unit throughout the analysis.
  • ...and 13 more figures
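
The simulation procedure described in the caption of Figure 3 can be sketched as follows. This is a simplified illustration in Python, not the package's plotting code: it assumes a balanced design, a binary event instead of an ordinal score, and a hypothetical `toy_simulator` with independent Bernoulli outcomes and no random effects. Only the mechanics (simulate many datasets, summarise each one, take empirical quantiles per visit) follow the caption.

```python
import math
import random


def vpc_band(simulate_dataset, n_sim=1000, level=0.95, seed=1):
    """Prediction band for the per-visit proportion of events.

    simulate_dataset(rng) must return one row per subject, each row holding
    that subject's 0/1 outcome at every visit (balanced design assumed).
    """
    rng = random.Random(seed)
    sims = []  # sims[k][v] = simulated proportion at visit v in replicate k
    for _ in range(n_sim):
        data = simulate_dataset(rng)
        sims.append([sum(col) / len(col) for col in zip(*data)])
    tail = (1 - level) / 2
    band = []
    for per_visit in zip(*sims):
        ordered = sorted(per_visit)
        # Conservative empirical quantiles: round outwards.
        lo = ordered[math.floor(tail * (n_sim - 1))]
        hi = ordered[math.ceil((1 - tail) * (n_sim - 1))]
        band.append((lo, hi))
    return band


def toy_simulator(rng, n_subjects=100, probs=(0.4, 0.3, 0.2)):
    """Hypothetical model: independent Bernoulli outcomes at three visits."""
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(n_subjects)]


band = vpc_band(toy_simulator, n_sim=200)
observed = [0.42, 0.28, 0.21]  # made-up observed proportions per visit
inside = [lo <= obs <= hi for (lo, hi), obs in zip(band, observed)]
```

Observed proportions falling outside the band at several visits would suggest model misspecification. A real check would simulate from the fitted mixed-effect model, including the random effects, and would be repeated within each stratum.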