Prediction intervals for overdispersed multinomial data with application to historical controls

Sören Budig, Frank Schaarschmidt, Max Menssen

Abstract

In pharmaceutical and toxicological research, historical control data are increasingly used to validate concurrent control groups, typically via the construction of historical control limits. While methods have been described for continuous and dichotomous endpoints, approaches for overdispersed multinomial data, common in developmental and reproductive toxicology or histopathology, are currently lacking. This article introduces and compares methods for constructing simultaneous prediction intervals for future multinomial observations subject to overdispersion. We investigate a range of frequentist approaches, including asymptotic approximations and bootstrap techniques (incorporating symmetric, asymmetric, and marginal calibration, as well as rank-based methods), alongside Bayesian hierarchical models. Extensive simulation studies assessing simultaneous coverage probability and the balance of lower and upper tail error probabilities show that standard asymptotic methods and simple Bonferroni adjustments yield liberal intervals, especially for small sample sizes or rare event categories. In contrast, bootstrap methods, specifically the Marginal Calibration and Rank-Based Simultaneous Confidence Sets, provide reliable error control and equal tail probabilities across diverse scenarios involving varying cluster sizes and degrees of overdispersion. These methods fill an important gap for multinomial endpoints and support the validation of concurrent controls using historical control data, in line with the recent European Food Safety Authority scientific opinion on the use and reporting of historical control data.

Paper Structure

This paper contains 21 sections, 25 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Simulated simultaneous coverage probability of the ten selected methods for all scenarios and settings. The color indicates the number of categories $C$ in the true probability vector, and the shape represents the number of clusters. The nominal level of 0.95 is represented by the solid horizontal line, and the dashed lines represent the Monte Carlo error of the simulation. The horizontal facets indicate the magnitude of overdispersion ($\phi$).
  • Figure 2: Assessment of equal tail probabilities for the lower (top row) and upper (bottom row) historical control limits of the six best-performing methods in three-category ($C=3$) scenarios. The top row shows $P(y_c \ge L_c)$ and the bottom row shows $P(y_c \le U_c)$. Each point represents one bound-specific probability for one category. The horizontal black line indicates the nominal Bonferroni-adjusted target of $1 - \alpha/(2C) \approx 0.9917$.
  • Figure 3: Application of the proposed methods to a simulated histopathological dataset. The left panel shows the raw counts of the ten historical studies ($K=10$, distinguished by shape), illustrating between-study variability. The right panel displays the 95% simultaneous PIs for the current trial based on the six best-performing methods. The black-and-white circles indicate the predicted mean counts ($\hat{\bm{y}}$), and the red crosses indicate the actually observed counts of the simulated current study.
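
The Bonferroni-adjusted target quoted in the Figure 2 caption can be checked by hand. The step below assumes $\alpha = 0.05$, as implied by the nominal level of 0.95, and an even split of $\alpha$ over the $2C$ one-sided bounds ($C$ lower and $C$ upper limits). It is a sketch of the arithmetic only, not a description of how any of the compared methods calibrates its limits:

$$
1 - \frac{\alpha}{2C} \;=\; 1 - \frac{0.05}{2 \cdot 3} \;=\; 1 - 0.0083\ldots \;\approx\; 0.9917 .
$$

By the union bound, if each of the $2C$ bounds is violated with probability at most $\alpha/(2C)$, then the probability that at least one bound is violated is at most $2C \cdot \alpha/(2C) = \alpha$, so the simultaneous coverage is at least $1 - \alpha = 0.95$.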