Improving online FDR procedures via online analogs of e-closure and compound e-values

Ziyu Xu, Lasse Fischer, Aaditya Ramdas

Abstract

In many scientific applications, hypotheses are generated and tested continuously in a stream. We develop a framework for improving online multiple testing procedures with false discovery rate (FDR) control under arbitrary dependence. Our approach is two-fold: we construct methods via the online e-closure principle, as well as a novel formulation of online compound e-values that is defined through donations. This yields strict power improvements over state-of-the-art e-value and p-value procedures while retaining FDR control. We further derive algorithms that compute the decision at time $t$ in $O(\log t)$ time, and we demonstrate improved empirical performance on synthetic and real data.

Paper Structure

This paper contains 45 sections, 15 theorems, 92 equations, 10 figures, 2 tables.

Key Result

Theorem 1

Let $(E_S)_{S \in 2^\mathbb{N}}$ be an increasing e-collection. Assume $E_S$ is measurable with respect to $\mathcal{F}_{\sup(S)}$ for all finite nonempty $S$. Then the associated e-closure collections $(\mathcal{C}_t)_{t \in \mathbb{N}}$ in eq:online-eclosure-col form an online procedure that satis... Consequently, any discovery sequence $\mathbf{R}$ with $R_t \in \mathcal{C}_t$ for all $t \in \mathbb{N}$ ...

Figures (10)

  • Figure 1: Summary of the paper's main technical contributions.
  • Figure 2: Local dependence simulation summary over 200 trials. The left column shows e-value procedures and the right column shows p-value procedures. The top row reports power as the non-null fraction $\pi_1$ increases (with $\mu = 3$ and $\delta = 0.1$), while the bottom row reports mean wall-clock runtime (log scale) as the number of hypotheses increases. Donation and closed variants improve power over the corresponding baselines, and donation variants remain computationally practical compared with closed variants.
  • Figure 3: Plots of discoveries made by e-value procedures along with the accompanying real dataset.
  • Figure 4: Power comparison for alternative choice of $\gamma$ for $\overline{\textnormal{e-LOND}}$.
  • Figure 5: FDR for the alternative-$\gamma$ closure comparison. All methods stay controlled at the target level $\delta = 0.1$.
  • ...and 5 more figures
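
For context on the $\gamma$ sequence and the e-LOND baseline named in the captions above, the following is a minimal sketch of plain e-LOND as it is usually stated in the online e-value literature: reject the $t$-th hypothesis when $e_t \geq 1 / (\delta \gamma_t (|R_{t-1}| + 1))$, where $\gamma$ is a nonnegative sequence summing to at most one. It is not the closed or donation variant developed in this paper. The function names `gamma_sequence` and `e_lond`, and the choice $\gamma_t = 6 / (\pi^2 t^2)$, are assumptions made for illustration only.

```python
import math


def gamma_sequence(n):
    """First n terms of gamma_t = 6 / (pi^2 * t^2); the full series sums to 1.

    This particular choice of gamma is an assumption for illustration.
    """
    return [6.0 / (math.pi ** 2 * t ** 2) for t in range(1, n + 1)]


def e_lond(e_values, delta=0.1, gammas=None):
    """Baseline e-LOND sketch (hypothetical helper, not the paper's improved method).

    Rejects hypothesis t when e_t >= 1 / (delta * gamma_t * (R_{t-1} + 1)),
    where R_{t-1} is the number of rejections made before time t.
    """
    if gammas is None:
        gammas = gamma_sequence(len(e_values))
    decisions = []
    num_rejections = 0
    for e_t, gamma_t in zip(e_values, gammas):
        # Test level at time t grows with the number of earlier rejections.
        level = delta * gamma_t * (num_rejections + 1)
        reject = e_t >= 1.0 / level
        decisions.append(reject)
        num_rejections += int(reject)
    return decisions


if __name__ == "__main__":
    # Toy stream of e-values; the numbers are made up for the example.
    print(e_lond([20.0, 1.0, 100.0, 0.5], delta=0.1))
```

Because the threshold at time $t$ depends only on $\gamma_t$ and the count of earlier rejections, each decision takes constant time, and a heavier-tailed $\gamma$ spends less of the budget on early hypotheses and leaves more for later ones.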

Theorems & Definitions (30)

  • Theorem 1: Online SupFDR e-closure
  • Theorem 2
  • Remark 3
  • Definition 4: $\boldsymbol{\gamma}$-online compound e-values and $\boldsymbol{\gamma}$-weighted donations
  • Proposition 5
  • Proposition 6
  • Proposition 7
  • Theorem 8
  • Theorem 9
  • Theorem 10
  • ...and 20 more