A fast algorithm to compute a curve of confidence upper bounds for the False Discovery Proportion using a reference family with a forest structure

Guillermo Durand

TL;DR

The paper tackles the challenge of computing a full curve of post hoc FDP bounds $V^*_{\mathfrak{R}}(S_t)$ along a nested sequence of hypothesis sets, by exploiting forest-structured reference families. It introduces pruning of the forest and, most importantly, a fast $O(m|\mathcal{K}|)$ curve-computation algorithm that maintains per-region counters and a partition $\mathcal{P}^t$ to update the curve efficiently; the core identity $V^*_{\mathfrak{R}}(S_t)=\sum_{k\in\mathcal{P}^t} \zeta_k \wedge |S_t\cap R_k|$ is what allows the curve to be updated one hypothesis at a time along the path. The authors prove the correctness of the curve updates, implement the methods in the RR-base package, and report substantial speedups in numerical experiments on large-scale scenarios. This makes exact curve computation practical in high-dimensional multiple testing and allows a broader empirical study of post hoc inference strategies.

Abstract

This paper presents a new algorithm (and an additional trick) for quickly computing an entire curve of post hoc bounds for the False Discovery Proportion when the construction of the underlying bound $V^*_{\mathfrak{R}}$ is based on a reference family $\mathfrak{R}$ with a forest structure à la Durand et al. (2020). By an entire curve, we mean the values $V^*_{\mathfrak{R}}(S_1),\dotsc,V^*_{\mathfrak{R}}(S_m)$ computed on a path of increasing selection sets $S_1\subsetneq\dotsb\subsetneq S_m$, $|S_t|=t$. The new algorithm leverages the fact that going from $S_t$ to $S_{t+1}$ is done by adding only one hypothesis.
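To make the "one hypothesis at a time" idea concrete, here is a minimal Python sketch. It is not the paper's algorithm and does not maintain the partition $\mathcal{P}^t$; it only illustrates how per-region counters can be updated along the path $S_1\subsetneq\dotsb\subsetneq S_m$. It assumes that, for nested regions, the bound on a region is the smaller of $\zeta_k$ and the sum of its children's bounds plus the number of selected hypotheses of $R_k$ lying in no child, and that uncovered hypotheses each count for one. The names `Region`, `curve`, `bound_naive` and `demo_forest` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Region:
    """Hypothetical forest node: a region R_k with its bound zeta_k."""
    members: frozenset                 # hypothesis indices in R_k
    zeta: int                          # assumed bound on false positives in R_k
    children: List["Region"] = field(default_factory=list)
    parent: Optional["Region"] = None
    loose: int = 0      # selected hypotheses of R_k lying in no child region
    child_sum: int = 0  # sum of the children's current values
    value: int = 0      # current bound on false positives in S_t ∩ R_k


def curve(roots, order):
    """Return the bounds for S_1, ..., S_m, where S_t = first t entries of `order`."""
    deepest, stack = {}, list(roots)
    while stack:  # a child is always processed after its parent
        node = stack.pop()
        for i in node.members:
            deepest[i] = node          # deeper regions overwrite shallower ones
        for child in node.children:
            child.parent = node
            stack.append(child)
    total, out = 0, []
    for i in order:
        node = deepest.get(i)
        if node is None:
            total += 1                 # uncovered hypothesis: trivial bound
        else:
            node.loose += 1
            while node is not None:    # walk up towards the root
                new = min(node.zeta, node.loose + node.child_sum)
                delta = new - node.value
                if delta == 0:
                    break              # nothing changes higher up
                node.value = new
                if node.parent is None:
                    total += delta
                else:
                    node.parent.child_sum += delta
                node = node.parent
        out.append(total)
    return out


def bound_naive(roots, selected):
    """Recompute the same bound from scratch for one set (reference check)."""
    def rec(node):
        inside = selected & node.members
        covered = set().union(*(c.members for c in node.children))
        return min(node.zeta, len(inside - covered) + sum(rec(c) for c in node.children))
    covered_by_roots = set().union(*(r.members for r in roots))
    return sum(rec(r) for r in roots) + len(selected - covered_by_roots)


def demo_forest():
    """Toy forest on hypotheses 0..7; hypothesis 7 belongs to no region."""
    b = Region(frozenset({0, 1, 2}), zeta=1)
    c = Region(frozenset({3, 4}), zeta=2)
    a = Region(frozenset(range(6)), zeta=3, children=[b, c])
    d = Region(frozenset({6}), zeta=0)
    return [a, d]


order = [0, 1, 3, 5, 6, 7, 4, 2]
print(curve(demo_forest(), order))  # [1, 1, 2, 3, 3, 4, 4, 4]
```

In this sketch each step only touches the regions containing the newly added hypothesis, so one step costs at most the depth of the forest, whereas recomputing each $V^*_{\mathfrak{R}}(S_t)$ from scratch visits every region at every step.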

