Unification of Rare and Weak Detection Models using Moderate Deviations Analysis and Log-Chisquared P-values

Alon Kipnis

TL;DR

This work unifies Rare and Weak detection models under a Rare Moderate Departures (RMD) formulation analyzed through log-chisquared P-values. Under the global null, $-2\log(p_i)$ behaves like $\mathrm{Exp}(2)$, while under the alternative a vanishingly small fraction of these statistics approximately follow a scaled noncentral chi-squared law $Q_i^{(n)}\overset{D}{=}(\mu_n(\rho)+\sigma Z)^2$ with $\mu_n(\rho)=\sqrt{2\rho\log(n)}$, which enables precise phase-transition characterizations. The paper derives the optimal testing regime: Higher Criticism and Berk-Jones achieve maximal asymptotic power outside a powerless region delimited by $\rho^*(\beta,\sigma)$, whereas approaches based on Bonferroni, BH-FDR, the minimal P-value, and Fisher's method are suboptimal in general. It also instantiates the framework in several models (heteroscedastic normal, Poisson, binomial) and shows, with supporting empirical evidence, that log-chisquared tails approximate moderately perturbed P-values better than log-normal ones. The framework yields new results on two-sample heteroscedastic models and perturbed binomial experiments, offering a cohesive, theory-driven guide for high-dimensional inference in sparse settings.
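The mixture described above can be simulated directly. The following is a minimal sketch, not the paper's code: the function name `sample_rmd_statistics` is hypothetical, $\mathrm{Exp}(2)$ is read as an exponential law with mean 2 (that is, $\chi^2_2$), and the sparsity calibration $\epsilon_n = n^{-\beta}$ is an assumption based on the usual rare/weak convention, since this summary does not state how $\beta$ enters.

```python
import numpy as np

def sample_rmd_statistics(n, beta, rho, sigma, rng):
    """Draw n values of -2*log(p_i) from a rare/weak mixture (illustrative sketch)."""
    eps = n ** (-beta)                    # ASSUMPTION: fraction of non-null effects
    mu = np.sqrt(2.0 * rho * np.log(n))   # mu_n(rho) = sqrt(2 * rho * log n)
    # Null component: exponential with mean 2, i.e. chi-squared with 2 d.o.f.
    null_part = rng.exponential(scale=2.0, size=n)
    # Non-null component: (mu_n(rho) + sigma * Z)^2 with Z standard normal.
    alt_part = (mu + sigma * rng.standard_normal(n)) ** 2
    is_alt = rng.random(n) < eps
    return np.where(is_alt, alt_part, null_part), is_alt

rng = np.random.default_rng(0)
x, is_alt = sample_rmd_statistics(n=100_000, beta=0.7, rho=0.5, sigma=1.0, rng=rng)
p_values = np.exp(-x / 2.0)               # invert x = -2 * log(p)
```

With these settings only a few dozen of the 100,000 statistics come from the non-null component, which is what makes the detection problem hard for tests that pool all of the evidence.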

Abstract

Rare and Weak models for multiple hypothesis testing assume that only a small proportion of the tested hypotheses concern non-null effects and the individual effects are only moderately large, so they generally do not stand out individually, for example in a Bonferroni analysis. Such models have been studied in quite a few settings, for example in some cases studies focused on an underlying Gaussian means model for the hypotheses being tested; in others, Poisson and Binomial. Such seemingly different models have the following common structure. Summarizing the evidence of individual tests by the negative logarithm of its P-value, the model is asymptotically equivalent to a situation in which most negative log P-values have a standard exponential distribution but a small fraction might have an alternative distribution which is approximately noncentral chisquared on one degree of freedom. We characterize the asymptotic performance of global tests combining asymptotic log-chisquared P-values in terms of the chisquared mixture parameters: the scaling parameter controlling heteroscedasticity, the non-centrality parameter, and the parameter controlling the rarity of individual non-null effects. In a phase space involving the last two parameters, we derive a region where all tests are asymptotically powerless. Outside of this region, the Berk-Jones and the Higher Criticism tests have maximal power. Inference techniques based on the minimal P-value, false-discovery rate controlling, and Fisher's combination test have sub-optimal asymptotic phase diagrams. Our analysis yields the asymptotic power of global testing in various new rare and weak models, including two-sample heteroscedastic normal mixtures and binomial experiments with perturbed probabilities of success.
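Three of the global tests named in the abstract can be sketched as follows. This is an illustrative sketch rather than the paper's implementation: the Higher Criticism formula is one common variant (standardizing by $p_{(i)}(1-p_{(i)})$ and maximizing over the smallest half of the P-values), the cutoff `gamma0` and all function names are assumptions, and Berk-Jones is omitted for brevity.

```python
import numpy as np
from scipy import stats

def higher_criticism(p_values, gamma0=0.5):
    """One common HC variant: max standardized gap between i/n and p_(i)."""
    p = np.sort(np.asarray(p_values, dtype=float))
    n = p.size
    k = max(1, int(gamma0 * n))           # only the k smallest P-values are used
    i = np.arange(1, k + 1)
    p_k = np.clip(p[:k], 1e-300, 1 - 1e-16)   # guard against division by zero
    z = np.sqrt(n) * (i / n - p_k) / np.sqrt(p_k * (1.0 - p_k))
    return float(z.max())

def fisher_p_value(p_values):
    """Fisher's combination: -2 * sum(log p_i) is chi-squared(2n) under the null."""
    p = np.clip(np.asarray(p_values, dtype=float), 1e-300, 1.0)
    stat = -2.0 * np.log(p).sum()
    return float(stats.chi2.sf(stat, df=2 * p.size))

def bonferroni_p_value(p_values):
    """Minimal P-value with a Bonferroni correction."""
    p = np.asarray(p_values, dtype=float)
    return float(min(1.0, p.size * p.min()))

# Deterministic toy input: an evenly spaced "null-like" grid, then the same
# grid with five very small P-values planted among the smallest entries.
n = 1000
grid = (np.arange(1, n + 1) - 0.5) / n
planted = grid.copy()
planted[:5] = 1e-8
hc_grid = higher_criticism(grid)
hc_planted = higher_criticism(planted)
```

On this toy input the planted P-values raise the Higher Criticism statistic by several orders of magnitude. Rejection thresholds for the statistic are not covered here and would need to be calibrated, for example by simulation under the global null.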

Paper Structure

This paper contains 45 sections, 21 theorems, 210 equations, 5 figures, 1 table.

Key Result

Theorem 1

Consider the hypothesis testing problem eq:hyp_log_n_appx. For every $i=1,\ldots,n$, assume that $Q_i^{(n)}$ is absolutely continuous with respect to $E_i^{(n)}$, and let be the likelihood ratio between the mixture components. Denote Suppose that there exists $\gamma>0$ such that, for any $q\in(r,1+\gamma)$, If $\rho< \rho^*(\beta,\sigma)$, all tests are asymptotically powerless.

Figures (5)

  • Figure 1: Phase Diagram. The phase transition curve $\rho^*(\beta,\sigma)$ of \ref{eq:rho} defines the detection boundary in all Rare Moderate Departure models. For $\rho < \rho^*(\beta,\sigma)$, all tests are asymptotically powerless. For $\rho > \rho^*(\beta,\sigma)$, some tests, including Higher Criticism and Berk-Jones, are asymptotically powerful.
  • Figure 2: Two-Sample Phase Diagram. The phase transition curve $\rho_{\mathsf{two-sample}}^*(\beta,s)$ of \ref{eq:rho_twosample} defines the detection boundary for an asymptotically log-chisquared perturbation model \ref{eq:hyp_log_n}. For $r< \rho_{\mathsf{two-sample}}^*(\beta,s)$, all tests are powerless. For $r > \rho_{\mathsf{two-sample}}^*(\beta,s)$, the Higher Criticism and the Berk-Jones tests are asymptotically powerful. The faint lines correspond to $2\rho^*(\beta,s)$, where we have $\rho_{\mathsf{two-sample}}^*(\beta,1)=2\rho^*(\beta,1)$.
  • Figure 3: Phase transitions of multiple binomial experiments with perturbed success probabilities of \ref{eq:binomial_data}. The phase transition curve $\rho_{\mathsf{Bin}}^*(\beta,s)$ of \ref{eq:rho_bin} defines the detection boundary in the multiple binomials model of \ref{eq:binomial_data} under the calibration \ref{eq:binomial_calibration}. The parameter $s$ controls the number of experiments in individual binomial trials according to \ref{eq:binomial_calibration} (larger $s$ means fewer trials). $\rho_{\mathsf{Bin}}^*(\beta,s=9/4)$ asymptotes to the dashed line. The case $s \to 0$ corresponds to the homoscedastic case analyzed in \cite{mukherjee2015hypothesis}.
  • Figure 4: Comparing log-normal and log-chisquared approximations to moderately perturbed P-values $\pi_1,\ldots,\pi_n$. Here $\pi_i = \bar{\Phi}(X_i)$, $X_i \sim \mathcal{N}(\sqrt{2r\log(n)},1)$, with $n=10^3$ (top) and $n=10^5$ (bottom). Left: histogram of $\{-2\log(\pi_i)\}_{i=1}^n$. Middle: QQ-plots of the empirical distribution of $\{-2\log(\pi_i)\}_{i=1}^n$ against the noncentral chisquared distribution. Right: QQ-plots of the empirical distribution of $\{-2\log(\pi_i)\}_{i=1}^n$ against the normal distribution.
  • Figure 5: Comparing the fit of the empirical distribution of moderately perturbed P-values to the normal and noncentral chisquared distributions. Both panels show the rejection rate of the Anderson-Darling (AD) Goodness-of-fit test at a significance level of $0.05$ (a smaller rejection rate indicates a better fit). Left: rejection rate versus perturbation intensity parameter $r$; $n=1,000$ is fixed. Right: rejection rate versus sample size $n$; $r=2$ is fixed.

Theorems & Definitions (27)

  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Proposition 7
  • Proposition 8
  • Proposition 9
  • ...and 17 more