Aggregating Dependent Signals with Heavy-Tailed Combination Tests
Lin Gui, Yuchao Jiang, Jingshu Wang
TL;DR
This work analyzes heavy-tailed p-value combination tests (notably the Cauchy combination test and the harmonic mean p-value) for aggregating dependent signals when the number of base tests is fixed and the global significance level $\alpha$ tends to zero. It develops a unified theory for both one-sided and two-sided p-values within the regularly varying tail framework, proving asymptotic validity under pairwise quasi-asymptotic independence and, when correlations are not perfectly aligned, asymptotic equivalence to Bonferroni for two-sided p-values. Empirical results show that under asymptotic independence these tests behave like Bonferroni at very small $\alpha$, whereas under asymptotic dependence (e.g., multivariate $t$) they can offer substantial power gains, especially for dense signals and heavier tails (tail index $\gamma < 1$). Real-data applications in circadian rhythm detection and GWAS demonstrate improved power and computational efficiency over Bonferroni, with practical recommendations such as using left-truncated $t_1$ distributions to mitigate issues with negatively supported transformed statistics. Overall, the paper highlights when heavy-tailed combination tests add value and how to implement them robustly in dependent settings.
Abstract
Combining dependent p-values poses a long-standing challenge in statistical inference, particularly when aggregating findings from multiple methods to enhance signal detection. Recently, p-value combination tests based on regularly varying-tailed distributions, such as the Cauchy combination test and harmonic mean p-value, have attracted attention for their robustness to unknown dependence. This paper provides a theoretical and empirical evaluation of these methods under an asymptotic regime where the number of p-values is fixed and the global test significance level approaches zero. We examine two types of dependence among the p-values. First, when p-values are pairwise asymptotically independent, such as with bivariate normal test statistics with no perfect correlation, we prove that these combination tests are asymptotically valid. However, they become equivalent to the Bonferroni test as the significance level tends to zero for both one-sided and two-sided p-values. Empirical investigations suggest that this equivalence can emerge at moderately small significance levels. Second, under pairwise quasi-asymptotic dependence, such as with bivariate t-distributed test statistics, our simulations suggest that these combination tests can remain valid and exhibit notable power gains over Bonferroni, even as the significance level diminishes. These findings highlight the potential advantages of these combination tests in scenarios where p-values exhibit substantial dependence. Our simulations also examine how test performance depends on the support and tail heaviness of the underlying distributions.
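To make the comparison in the abstract concrete, the following is a minimal Python sketch of the three procedures it mentions, not the authors' implementation. The function names, the equal default weights, and the example p-values are assumptions made for illustration. The harmonic mean shown is the raw, uncalibrated statistic, so it only approximates a p-value when it is very small.

```python
import math


def cauchy_combination(pvals, weights=None):
    """Cauchy combination p-value with nonnegative weights summing to one."""
    n = len(pvals)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal weights by default
    # Map each p-value to the standard Cauchy quantile at 1 - p.
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvals))
    # Upper-tail probability of a standard Cauchy at the combined statistic.
    return 0.5 - math.atan(stat) / math.pi


def harmonic_mean_raw(pvals):
    """Raw (uncalibrated) equal-weight harmonic mean of the p-values."""
    return len(pvals) / sum(1.0 / p for p in pvals)


def bonferroni(pvals):
    """Bonferroni global p-value: n times the smallest p-value, capped at 1."""
    return min(1.0, len(pvals) * min(pvals))


if __name__ == "__main__":
    # Hypothetical input: one very small p-value among unremarkable ones.
    pvals = [1e-6, 0.3, 0.7, 0.5]
    print("Cauchy:       ", cauchy_combination(pvals))
    print("Harmonic mean:", harmonic_mean_raw(pvals))
    print("Bonferroni:   ", bonferroni(pvals))
```

With a single dominant small p-value, all three quantities are driven by the smallest p-value and come out nearly identical, which illustrates the Bonferroni-like behavior described above for very small significance levels.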
