Grouped Competition Test with Unified False Discovery Rate Control

Mingzhou Deng, Yan Fu

TL;DR

This work addresses multiple hypothesis testing under data heterogeneity and complex dependencies by introducing a unified competition-test framework. It proposes the grouped competition (GC) filter, which partitions hypotheses into homogeneous groups, applies group-wise competition statistics, and integrates the results with a data-driven correction to control the global FDR. The authors establish FDR control theorems for several forms of group statistics (GCS, IGCS, iGCS), propose data-driven grouping strategies including side-information-based grouping, and demonstrate through simulations and proteomics data that GC achieves higher power while maintaining stringent FDR control. The approach is validated in simulations under Gamma and Gaussian models and applied to mass spectrometry-based identification of protein modifications, suggesting practical value for high-dimensional, heterogeneous datasets.

Abstract

This paper discusses several p-value-free multiple hypothesis testing methods proposed in recent years and organizes them under a unified framework termed the competition test. Although existing competition tests are effective in controlling the False Discovery Rate (FDR), they struggle with data that exhibit strong heterogeneity or dependency structures. Based on this framework, the paper proposes a novel approach that applies a corrected competition procedure to grouped data with a certain structure and then integrates the results from each group. Using the favorable properties of the competition test, the paper proves a theorem showing that this approach controls the global FDR. We further show that although the correction parameters may cause a slight loss in power, this loss is typically minimal. Through simulation experiments and mass spectrometry data analysis, we illustrate the flexibility and efficacy of our approach.

Paper Structure

This paper contains 46 sections, 22 theorems, 115 equations, 9 figures, 4 tables, 5 algorithms.

Key Result

Theorem 1

If the random vectors $\mathbf{W},\mathbf{L}$ are conditionally exchangeable, then $\mathbb{E}\left[|V_+(0)|/(|V_-(0)|+1)\right]\leq r$. If the stopping-time and martingale properties are also satisfied, the competition filter controls the FDR at level $\alpha$, that is, $\mathrm{FDR}\leq\alpha$.
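
To make the statement concrete, the following is a minimal sketch of a generic competition filter, not the authors' algorithm. It assumes each hypothesis carries a score $W_j$ and a label $L_j\in\{+1,-1\}$ (target wins or competitor wins), and it assumes the usual thresholding rule from the knockoff and target-decoy literature: scan thresholds from the smallest score upward and stop at the first one where the estimate $r\,(\#\text{negatives}+1)/\#\text{positives}$ falls to $\alpha$ or below. The function name `competition_filter`, the multiplier `r`, and the stopping rule are illustrative assumptions; the excerpt above does not spell them out.

```python
def competition_filter(scores, labels, alpha, r=1.0):
    """Sketch of a generic competition filter (assumed form, see lead-in).

    scores: per-hypothesis scores W_j, larger means stronger evidence
    labels: +1 if the target won the competition, -1 if the competitor won
    alpha:  target FDR level
    r:      assumed bound on the null ratio E[|V+| / (|V-| + 1)]
    Returns the indices of the rejected hypotheses (label +1, score >= threshold).
    """
    pairs = list(zip(scores, labels))
    # Candidate thresholds: the distinct observed scores, smallest first.
    for t in sorted(set(scores)):
        n_pos = sum(1 for w, l in pairs if w >= t and l == 1)
        n_neg = sum(1 for w, l in pairs if w >= t and l == -1)
        # The "+1" mirrors the |V_-(0)| + 1 term in the theorem's ratio.
        fdr_hat = r * (n_neg + 1) / max(n_pos, 1)
        if fdr_hat <= alpha:
            return [j for j, (w, l) in enumerate(pairs) if w >= t and l == 1]
    # No threshold met the target level: reject nothing.
    return []


if __name__ == "__main__":
    w = [5, 4, 3, 2, 1]
    lab = [1, 1, 1, -1, 1]
    print(competition_filter(w, lab, alpha=0.4))
```

In this toy input the loop stops at the threshold 3, where no negative labels remain above the cut and the estimate is 1/3. Scanning from the smallest threshold upward returns the largest rejection set whose estimate meets the target level.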

Figures (9)

  • Figure 1: Realized FDR for different numbers of groups.
  • Figure 2: Number of correct rejections of the eBH procedure across different $\alpha_{cp}$.
  • Figure 3: Realized FDR and Power at target FDR levels under the Gamma model. In the left column, the "$-$" line is $y=x$. The "$-\circ-$" and "$-\star-$" lines show the realized FDR and the realized Power of the OC filter and the GC filter, respectively. The right two columns are violin plots of the realized FDR and Power for both filters.
  • Figure 4: Realized FDR and Power at target FDR levels under independent Gaussian model.
  • Figure 5: Realized FDR and Power at target FDR levels under independent Gaussian model with partial symmetry parameter $r=1.5$.
  • ...and 4 more figures

Theorems & Definitions (24)

  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Theorem 3
  • Corollary 1
  • Corollary 2
  • Theorem 4
  • Corollary 3
  • Corollary 4
  • Theorem 5
  • ...and 14 more