On the Adversarial Robustness of Benjamini Hochberg

Louis L Chen, Roberto Szechtman, Matan Seri

TL;DR

The paper investigates adversarial perturbations of the Benjamini-Hochberg $FDR$ control in large-scale multiple testing. It introduces a $c$-perturbation attacker and a balls-into-bins reformulation, deriving non-asymptotic guarantees and two attack algorithms, INCREASE-c and MOVE-1. The analysis reveals that BH can be significantly compromised when alternatives are not well separated from the null, with synthetic and conformal p-value experiments (including a credit card fraud task) illustrating practical vulnerability. The work highlights the need for caution in safety/security contexts and motivates robustness studies for BH and related step-up procedures.

Abstract

The Benjamini-Hochberg (BH) procedure is widely used to control the false discovery rate (FDR) in multiple testing. Applications of this control abound in drug discovery, forensics, anomaly detection, and, in particular, machine learning, ranging from nonparametric outlier detection to out-of-distribution detection and one-class classification methods. Considering this control could be relied upon in critical safety/security contexts, we investigate its adversarial robustness. More precisely, we study under what conditions BH does and does not exhibit adversarial robustness, we present a class of simple and easily implementable adversarial test-perturbation algorithms, and we perform computational experiments. With our algorithms, we demonstrate that there are conditions under which BH's control can be significantly broken with relatively few (even just one) test score perturbation(s), and we provide non-asymptotic guarantees on the expected adversarial adjustment to FDR. Our technical analysis involves a combinatorial reframing of the BH procedure as a "balls into bins" process, and draws a connection to generalized ballot problems to facilitate an information-theoretic approach for deriving non-asymptotic lower bounds.
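
To make the procedure under attack concrete, the following is a minimal sketch of the standard BH step-up rule. It is not code from the paper; the function name `benjamini_hochberg` and the toy p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, q):
    """Return the sorted indices rejected by the BH step-up rule at level q.

    Hypothetical helper for illustration: finds the largest rank k such that
    the k-th smallest p-value is at most q * k / n, then rejects the k
    hypotheses with the smallest p-values.
    """
    n = len(p_values)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up: remember the LAST rank that passes its threshold.
        if p_values[idx] <= q * rank / n:
            k = rank
    return sorted(order[:k])


# Toy data (assumed, not from the paper): no p-value passes its threshold.
original = [0.02, 0.03, 0.04, 0.9]
# Same data with a single p-value lowered from 0.04 to 0.035.
perturbed = [0.02, 0.03, 0.035, 0.9]

print(benjamini_hochberg(original, q=0.05))
print(benjamini_hochberg(perturbed, q=0.05))
```

In this toy case the original scores yield no rejections, while lowering a single p-value just under its threshold causes the three smallest p-values to be rejected together. This only illustrates the step-up sensitivity that the abstract alludes to; it is not an implementation of INCREASE-c or MOVE-1, whose definitions are not given in this summary.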

Paper Structure (28 sections, 21 theorems, 58 equations, 6 figures, 2 tables)

Key Result

Lemma 1.1

If every null p-value is super-uniform, equivalently $p_i \sim \mathbbm{P}_i^0 \succcurlyeq U(0,1)$ for all $i \in \mathcal{H}_0$, and the collection is jointly independent, then regardless of the collection of alternative distributions $\{\mathbbm{P}^1_i\}_{i \in \mathcal{H}_1}$,
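
The display that completes this statement did not survive in this extract. For orientation only, the classical BH guarantee under these hypotheses, which the lemma presumably restates, takes the form below. The notation is borrowed from the figure captions ($N$ hypotheses in total, $N_0 = |\mathcal{H}_0|$ true nulls, target level $q$) and is an assumption about the paper's exact statement; equality with $\frac{N_0}{N} q$ holds when the null p-values are exactly uniform.

```latex
\[
  FDR[BH_q] \;=\; \mathbbm{E}\, FDP[BH_q] \;\le\; \frac{N_0}{N}\, q \;\le\; q .
\]
```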

Figures (6)

  • Figure 1: $10^4$ simulations of FDP by $BH_q$ before and after INCREASE-10 is executed on the p-values. $N = 10^3$, $N_0 = 900$, and $q = 0.10.$
  • Figure 2: Sample Average ($10^4$-batch) estimates of $\mathbbm{E}_zFDP[BH_q; z_{+c}]$ and $\mathbbm{E}\left[\tilde{k}_{+c} - \tilde{k}\right]$ (in parentheses) when all $\{\mu^i_1\}_{i \in \mathcal{H}_1}$ commonly equal some $\mu_1 \in \{0, 1, 2\}$ and $N = 10^3$, $q = 0.10$, $\pi_0 = 0.90$, and all $\sigma^i = 1$.
  • Figure 3: Comparing the FDR increase $\Delta_1$ of INCREASE-1 with the lower bound $L_1$ of Theorem \ref{thm:: Mu1EqualsZeroBound} as functions of $q$ when $\mu_1 = 0$, $N = 1000$.
  • Figure 4: Comparing the FDR increase $\Delta_1$ of INCREASE-1 with the lower bound $L_1$ of Theorem \ref{thm:: Mu1EqualsZeroBound} as functions of $q$ when $\mu_1 = 0.25$, $N = 1000$.
  • Figure 5: $10^3$ simulations of FDP by $BH_q$ with and without application of INCREASE-5 on marginal conformal p-values \cite{bates2023testing} derived from an SVM one-class classifier on a test set with outliers drawn with $a = 1.5$.
  • ...and 1 more figure
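
The unattacked baseline behind the captions for Figures 1 and 2 ($N = 10^3$, $N_0 = 900$, $q = 0.10$, unit variance, common alternative mean $\mu_1$) can be approximated with a short Monte Carlo sketch. Several choices here are assumptions rather than details taken from the paper: p-values are computed from one-sided z-tests as $1 - \Phi(z)$, the helper names `bh_rejections`, `simulate_fdp` and `estimate_fdr` are hypothetical, and far fewer replications are used than the $10^4$ in the captions. No adversarial step is included.

```python
import random
from statistics import NormalDist


def bh_rejections(p_values, q):
    """Indices rejected by the BH step-up rule at level q."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / n:
            k = rank  # keep the largest rank that passes its threshold
    return order[:k]


def simulate_fdp(n=1000, n0=900, mu1=2.0, q=0.10, rng=None):
    """One draw of the false discovery proportion (FDP) under BH.

    Assumed model: indices 0..n0-1 are true nulls with uniform p-values;
    the rest are alternatives with z ~ N(mu1, 1) and p = 1 - Phi(z).
    """
    rng = rng or random.Random()
    std_normal = NormalDist()
    p = [rng.random() for _ in range(n0)]
    p += [1.0 - std_normal.cdf(rng.gauss(mu1, 1.0)) for _ in range(n - n0)]
    rejected = bh_rejections(p, q)
    if not rejected:
        return 0.0  # FDP is defined as 0 when nothing is rejected
    false_rejections = sum(1 for i in rejected if i < n0)
    return false_rejections / len(rejected)


def estimate_fdr(reps=300, seed=0, **kwargs):
    """Sample-average estimate of FDR = E[FDP] over `reps` replications."""
    rng = random.Random(seed)
    return sum(simulate_fdp(rng=rng, **kwargs) for _ in range(reps)) / reps


print(estimate_fdr(mu1=2.0))
```

With independent, exactly uniform null p-values, the estimate should hover around $\frac{N_0}{N} q = 0.09$ up to Monte Carlo error, which gives a reference point against which a perturbed run could be compared.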

Theorems & Definitions (33)

  • Lemma 1.1: Theorem 4.1 from \cite{efron2013computer}
  • Remark 1.2
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 4.1
  • Corollary 4.1
  • Corollary 4.1
  • Theorem 4.3
  • Remark 4.4
  • Theorem 7.1
  • ...and 23 more