Robust Power and Sample Size Calculations in Quasi-likelihood Models: Methods and Practice

Shijie Yuan, Amy Cochran, Paul Rathouz

Abstract

Accurate power and sample size (PSS) calculations are essential for designing studies that use quasi-likelihood (QL) models, which extend generalized linear models (GLMs) to settings where the full distribution of the outcome is not specified. Traditional PSS approaches often rely on restrictive distributional assumptions, limiting their applicability when responses have non-standard distributions, variance functions are misspecified, or predictors exhibit complex dependence structures. Building on recent advances in effect size measures for PSS that were developed with interpretability in mind, specifically 2 Standard Deviations in the Linear Predictor (2SLiP) and Pseudo-Partial $R^2$ (P2R2), this paper extends and evaluates these measures in the QL framework, focusing on their utility in PSS. We assess their empirical performance for the Wald test and then extend to the score test through extensive simulations across diverse outcome types, link functions, and variance structures. To illustrate practical utility, we apply these effect size measures to survey data on frontline health care workers from \citet{cahill2022occupational} to quantify the association between perceived personal protective equipment adequacy and mental health outcomes during the COVID-19 pandemic, adjusting for covariates. Our findings demonstrate that both 2SLiP and P2R2 provide robust and interpretable alternatives to traditional methods, maintaining accuracy with minimal distributional assumptions and enhancing the flexibility of PSS for realistic study designs.

Paper Structure (32 sections, 59 equations, 8 figures, 2 algorithms)

Figures (8)

  • Figure 1: The empirical type I error rates and power of the Wald test based on the simulated count data for three cases of outcome variables, using a log link and sample sizes $n$, $n_{\phi}$, and $n_R$. The X-axis represents the correlation parameter $\rho$ used to generate the Gaussian copula. The shaded envelopes represent the 95% confidence intervals of the empirical type I error rate and empirical power, respectively. The two horizontal lines in each panel indicate the 95% confidence interval around the corresponding target level (0.05 for type I error rate and 0.8 for power), serving as a benchmark for good calibration. Results falling within these horizontal lines indicate well-calibrated performance.
  • Figure 2: The empirical type I error rates and power of the Wald test based on the simulated count data for three cases of outcome variables, using an identity link and sample sizes $n$, $n_{\phi}$, and $n_R$. The X-axis represents the correlation parameter $\rho$ used to generate the Gaussian copula. The shaded envelopes represent the 95% confidence intervals of the empirical type I error rate and empirical power, respectively. The two horizontal lines in each panel indicate the 95% confidence interval around the corresponding target level (0.05 for type I error rate and 0.8 for power), serving as a benchmark for good calibration. Results falling within these horizontal lines indicate well-calibrated performance.
  • Figure 3: The empirical type I error rates and power of the Wald test based on the simulated positive continuous data for two cases of outcome variables, using a log link and sample sizes $n$, $n_{\phi}$, and $n_R$. The X-axis represents the correlation parameter $\rho$ used to generate the Gaussian copula. The shaded envelopes represent the 95% confidence intervals of the empirical type I error rate and empirical power, respectively. The two horizontal lines in each panel indicate the 95% confidence interval around the corresponding target level (0.05 for type I error rate and 0.8 for power), serving as a benchmark for good calibration. Results falling within these horizontal lines indicate well-calibrated performance.
  • Figure 4: The empirical type I error rates and power of the score test based on the simulated count data for three cases of outcome variables, using a log link and sample sizes $n_s$, $n_{\phi}$, and $n_{R}$. The X-axis represents the correlation parameter $\rho$ used to generate the Gaussian copula. The shaded envelopes represent the 95% confidence intervals of the empirical type I error rate and empirical power, respectively. The two horizontal lines in each panel indicate the 95% confidence interval around the corresponding target level (0.05 for type I error rate and 0.8 for power), serving as a benchmark for good calibration. Results falling within these horizontal lines indicate well-calibrated performance.
  • Figure 5: Performance of effect size measures and corresponding sample sizes across scaled predictor effects.
  • ...and 3 more figures
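
The captions above describe a benchmark band: a 95% confidence interval around each target level (0.05 for type I error rate, 0.8 for power), with results inside the band read as well calibrated. The following is a minimal sketch of how such a band could be computed, assuming a normal approximation to the binomial proportion. The function names and the replicate count `n_reps = 1000` are hypothetical and are not taken from the paper, which may construct its intervals differently.

```python
import math

# 97.5th percentile of the standard normal, used for a two-sided 95% band.
Z_975 = 1.959963984540054


def calibration_band(target, n_reps, z=Z_975):
    """Return (lower, upper) of a normal-approximation 95% band around `target`.

    `target` is the nominal rate (e.g. 0.05 or 0.8) and `n_reps` is the number
    of simulation replicates used to estimate the empirical rate.
    """
    half_width = z * math.sqrt(target * (1.0 - target) / n_reps)
    return target - half_width, target + half_width


def is_well_calibrated(empirical_rate, target, n_reps):
    """Check whether an empirical rate falls inside the benchmark band."""
    lower, upper = calibration_band(target, n_reps)
    return lower <= empirical_rate <= upper


if __name__ == "__main__":
    n_reps = 1000  # hypothetical replicate count, not stated in the source
    for label, target in [("type I error rate", 0.05), ("power", 0.80)]:
        lower, upper = calibration_band(target, n_reps)
        print(f"{label}: target={target}, band=[{lower:.4f}, {upper:.4f}]")
```

The band narrows at rate $1/\sqrt{n_{\text{reps}}}$, so the same empirical deviation from the target can count as well calibrated with few replicates and miscalibrated with many.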