Predictive Power Analysis of Multiple Test Procedures Under Arbitrary Dependence

George Karabatsos

Abstract

Many statistical problems can be addressed by applying a multiple testing procedure (MTP) that controls either the Family-wise Error Rate (FWER) or the False Discovery Rate (FDR) under unknown, arbitrarily interdependent $p$-values, without explicitly modeling their inter-correlations. Such MTPs include the FWER-controlling Bonferroni (1936) and Holm (1979) MTPs; the FDR-controlling Benjamini and Yekutieli (2001) MTP; and the DP-MTP (Karabatsos, 2025), based on a Dirichlet process (DP) prior distribution supporting the entire space of MTPs that control either the FWER or FDR. For such an MTP, this study introduces a new and congenial method of Bayesian predictive power analysis, for power calculation and sample size determination for any planned future (e.g., replication or interim) study. This method is based on a joint prior distribution defining a scale matrix mixture of asymmetric multivariate normal mean-variance mixture distributions, factorized as a general prior distribution for effect sizes (e.g., obtained from expert judgment or from results of prior studies) and a uniform prior distribution for correlation matrices representing arbitrary dependencies between the $p$-values of the test statistics of the given multiple hypothesis tests under their alternative hypotheses. The method also yields $p$-value weights, which can be used to assess significance-chasing biases (e.g., publication bias, $p$-hacking) in multiple testing and to minimize their relative impact, without assuming that the $p$-values (effect sizes) are independent. The simulation-based method is illustrated through the analysis of $p$-values obtained from a famous study of lead exposure that was re-analyzed in the previous MTP literature, using the R package bnpMTP.
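
To make the three classical procedures named in the abstract concrete, the following is a minimal Python sketch of their textbook rejection rules. It is an illustration only: the function names and the example $p$-values are assumptions made here, it does not use or reproduce the bnpMTP package, and it does not cover the DP-MTP or the predictive power analysis itself.

```python
from typing import List


def bonferroni(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject H_i when p_i <= alpha / m (FWER control under any dependence)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down: walk the sorted p-values, stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank = 0, 1, ..., m - 1
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger p-value is retained as well
    return reject


def benjamini_yekutieli(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """BY step-up: find the largest k with p_(k) <= k * alpha / (m * c_m),
    where c_m = 1 + 1/2 + ... + 1/m, and reject the k smallest p-values."""
    m = len(pvals)
    c_m = sum(1.0 / j for j in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for k, i in enumerate(order, start=1):
        if pvals[i] <= k * alpha / (m * c_m):
            k_max = k
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Made-up p-values, for illustration only (not from the lead exposure study).
example_pvals = [0.001, 0.012, 0.02, 0.04, 0.3]
print(bonferroni(example_pvals))
print(holm(example_pvals))
print(benjamini_yekutieli(example_pvals))
```

None of the three rules requires a model of the correlations between $p$-values, which is the property the abstract emphasizes.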

Paper Structure

This paper contains 6 sections, 1 theorem, 17 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Theorem 5.1

The estimator $\bar{h}_S$ converges almost surely (a.s.), $\bar{h}_S\overset{a.s.}{\rightarrow}\mathbb{E}_f\{h(\bm{T})\}$, by the Strong Law of Large Numbers (SLLN), and $v_S$ is an estimate of the variance:
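
The paper's own expression for $v_S$ is not reproduced on this page. As a rough illustration of the convergence claim only, the sketch below assumes plain i.i.d. Monte Carlo sampling from $f$, takes $\bar{h}_S$ to be the sample mean of $h(T_s)$, and uses the sample variance divided by $S$ as a stand-in variance estimate. The function `mc_estimate`, the scalar test statistic, and the example distribution are all hypothetical choices made for this sketch; the paper's estimator may differ (for example, $\bm{T}$ is a vector there).

```python
import random
import statistics
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def mc_estimate(h: Callable[[T], float],
                sampler: Callable[[], T],
                S: int) -> Tuple[float, float]:
    """Return (h_bar, v): the sample mean of h(T_s) over S draws, and the
    sample variance of h(T_s) divided by S (an assumed stand-in for v_S)."""
    draws = [h(sampler()) for _ in range(S)]
    h_bar = statistics.fmean(draws)
    v = statistics.variance(draws) / S  # assumed form, not taken from the source
    return h_bar, v


# Hypothetical example: T ~ N(2, 1) and h(T) = 1{T > 1.96}, so E_f{h(T)} is
# the power of a one-sided z-test when the true standardized effect is 2.
random.seed(0)
h_bar, v = mc_estimate(lambda t: 1.0 if t > 1.96 else 0.0,
                       lambda: random.gauss(2.0, 1.0),
                       20000)
print(h_bar, v)
```

Increasing `S` shrinks `v` at rate $1/S$, which is the practical meaning of the SLLN statement: the simulated average settles on the expectation as more draws are taken.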

Figures (1)

  • Figure 1: For the 41 tests, a comparison of marginal powers and significance-chasing biases across 4 MTPs.

Theorems & Definitions (2)

  • Theorem 5.1
  • proof