An online generalization of the (e-)Benjamini-Hochberg procedure

Lasse Fischer, Ziyu Xu, Aaditya Ramdas

TL;DR

This paper considers a relaxed problem setup that allows the current hypothesis to be rejected at any later step, and shows that this relaxation allows the natural and appropriate online extension of the BH and e-BH procedures to be defined.

Abstract

In online multiple testing, the hypotheses arrive one by one, and at each time we must immediately reject or accept the current hypothesis solely based on the data and hypotheses observed so far. Many online procedures have been proposed, but none of them are generalizations of the Benjamini-Hochberg (BH) procedure based on p-values, or of the e-BH procedure that uses e-values. In this paper, we consider a relaxed problem setup that allows the current hypothesis to be rejected at any later step. We show that this relaxation allows us to define -- what we justify extensively to be -- the natural and appropriate online extension of the BH and e-BH procedures. We show that the FDR guarantees for BH (resp. e-BH) and online BH (resp. online e-BH) are identical under positive, negative or arbitrary dependence, at fixed and stopping times. Further, the online BH (resp. online e-BH) rule recovers the BH (resp. e-BH) rule as a special case when the number of hypotheses is known to be fixed. Of independent interest, our proof techniques also allow us to prove that numerous existing online procedures, which were known to control the FDR at fixed times, also control the FDR at stopping times.
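For readers who want the offline baselines in front of them, the following is a minimal sketch of the classical (offline) BH and e-BH rules that the abstract says the online procedures recover when the number of hypotheses is fixed. It is not the paper's online rule, which is not specified in this summary. The function names `bh_reject` and `ebh_reject` are hypothetical, and the code assumes a fixed, known number of hypotheses K and the standard step-up formulations: BH rejects the k* smallest p-values, where k* is the largest k with p_(k) <= alpha * k / K, and e-BH rejects the k* largest e-values, where k* is the largest k with e_(k) >= K / (alpha * k).

```python
def bh_reject(p_values, alpha):
    """Classical offline BH: return sorted indices of rejected hypotheses."""
    K = len(p_values)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(K), key=lambda i: p_values[i])
    k_star = 0
    for k, i in enumerate(order, start=1):
        # Step-up rule: remember the largest k whose k-th smallest
        # p-value is at most alpha * k / K.
        if p_values[i] <= alpha * k / K:
            k_star = k
    return sorted(order[:k_star])


def ebh_reject(e_values, alpha):
    """Classical offline e-BH: return sorted indices of rejected hypotheses."""
    K = len(e_values)
    # Indices ordered from largest to smallest e-value.
    order = sorted(range(K), key=lambda i: -e_values[i])
    k_star = 0
    for k, i in enumerate(order, start=1):
        # Remember the largest k whose k-th largest e-value
        # is at least K / (alpha * k).
        if e_values[i] >= K / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the paper.
    print(bh_reject([0.001, 0.5, 0.012, 0.04, 0.9], alpha=0.05))   # [0, 2]
    print(ebh_reject([200, 1, 60, 5, 0.2], alpha=0.05))            # [0, 2]
```

Both rules need K before any threshold can be computed, which is the obstacle in the online setting, where hypotheses arrive one by one.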

Paper Structure (32 sections, 25 theorems, 66 equations, 6 figures, 3 tables)

Key Result

Proposition 2.1

For an infinite stream of hypotheses and the corresponding arbitrarily dependent e-values $(E_t)_{t \in \mathbb{N}}$, $\mathbb{E}\left[\sup_{R \in \mathcal{R}(\alpha)} \textnormal{FDP}(R)\right] \leq \alpha$.

Figures (6)

  • Figure 1: Power comparison of online e-BH and e-LOND for different proportions of false hypotheses. In the left (right) plot, the signal of the alternative is weak (strong).
  • Figure 2: Power comparison of online e-BH with non-boosted and boosted e-values for different proportions of false hypotheses. In the left (right) plot, the signal of the alternative is weak (strong). The simulation setup is described in Appendix \ref{sec:sim_setup}.
  • Figure 3: Power comparison of online e-BH with boosted and boosted e-values under local dependence for different proportions of false hypotheses. In the left (right) plot, the signal of the alternative is weak (strong). The simulation setup is described in Section \ref{sec:sim_setup}.
  • Figure 4: Power comparison of online BH and LORD for different proportions of false hypotheses. In the left plot, the sequence $(\gamma_t)_{t\in \mathbb{N}}$ decreases fast ($q=0.99$), and in the right plot it decreases slowly ($q=0.999$). The simulation setup is described in Section \ref{sec:LORD}.
  • Figure 5: Power comparison of online SBH and SAFFRON for different proportions of false hypotheses. In the left plot, the sequence $(\gamma_t)_{t\in \mathbb{N}}$ decreases fast ($q=0.99$), and in the right plot it decreases slowly ($q=0.999$). The simulation setup is described in Section \ref{appn:SBH}.
  • ...and 1 more figure

Theorems & Definitions (51)

  • Definition 1
  • Definition 2
  • Proposition 2.1
  • proof
  • Theorem 2.2
  • Proposition 2.3
  • Remark 1
  • Proposition 2.4
  • Proposition 2.5
  • Remark 2
  • ...and 41 more