Dependence-Aware False Discovery Rate Control in Two-Sided Gaussian Mean Testing

Deepra Ghosh, Sanat K. Sarkar

TL;DR

The paper addresses the lack of theoretical FDR guarantees for BH-type procedures in two-sided Gaussian mean testing under dependence by introducing positive left-tail dependence under the null (PLTDN). It then develops a broad Generalized Shifted BH (GSBH) framework that leverages p-value shifts with a tunable parameter to exploit correlation, yielding exact or nonasymptotic FDR control and substantial power gains across diverse dependence structures. The framework extends to regression-based variable selection via Shifted BBH (SBBH) and knockoff-assisted settings, with strong empirical support from simulations and an HIV dataset where GSBH methods reliably control FDP while identifying meaningful signals. Overall, the work provides a rigorous, practically implementable approach to FDR control under dependence for two-sided Gaussian testing and related structured inference tasks.

Abstract

This paper develops a general framework for controlling the false discovery rate (FDR) in multiple testing of Gaussian means against two-sided alternatives. The widely used Benjamini-Hochberg (BH) procedure provides exact FDR control under independence or conservative control under specific one-sided dependence structures, but its validity for correlated two-sided tests has remained an open question. We introduce the notion of positive left-tail dependence under the null (PLTDN), extending classical dependence assumptions to two-sided settings, and show that it ensures valid FDR control for BH-type procedures. Building on this framework, we propose a family of generalized shifted BH (GSBH) methods that incorporate correlation information through simple p-value adjustments. Simulation results demonstrate reliable FDR control and improved power across a range of dependence structures, while an application to an HIV gene expression dataset illustrates the practical effectiveness of the proposed approach.
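For orientation, the baseline that the abstract refers to can be sketched in a few lines. The code below is a minimal illustration of the standard BH step-up rule applied to two-sided p-values from standard-normal test statistics. It is not the paper's GSBH method, whose p-value shifts are not specified in this summary. The function names and the example z-scores are hypothetical and chosen only for illustration.

```python
import math

def two_sided_p_values(z_scores):
    """Two-sided p-values for N(0, 1) statistics: P = 2 * (1 - Phi(|z|))."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_scores]

def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices rejected by the standard BH step-up rule."""
    d = len(p_values)
    order = sorted(range(d), key=lambda i: p_values[i])
    r = 0
    # Step-up: find the largest rank j with P_(j) <= j * alpha / d.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / d:
            r = rank
    return sorted(order[:r])

# Illustrative (made-up) z-scores: three clear signals and three nulls.
z = [4.1, -3.5, 0.3, -0.8, 2.9, 1.1]
p = two_sided_p_values(z)
rejected = benjamini_hochberg(p, alpha=0.05)
print(rejected)
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank-specific threshold, provided a larger ordered p-value meets its threshold.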

Paper Structure

This paper contains 13 sections, 2 theorems, 23 equations, 25 figures.

Key Result

Lemma 1

Let $I_0 = \{j: H_j\; \hbox{is true} \}$. Then, where ${0}/{0} = 0$, ${\boldsymbol{P}}_{-i} = ({P}_1, \ldots, {P}_d)\setminus\{{P}_i\}$, and $R_{-i}\equiv R({\boldsymbol{P}}_{-i}) = \max_{1\le j \le d-1}\{j:{P}_{(j)\setminus\{i\}} \le \alpha_{j+1} \},$ with ${P}_{(1)\setminus\{i\}} \le \cdots \le {P}_{(d-1)\setminus\{i\}}$ being the ordered compo
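The leave-one-out count $R_{-i}$ defined in the lemma can be computed directly from its definition. The sketch below is hedged: the lemma statement above is truncated, so the critical values $\alpha_j$ are not defined in this excerpt. The code assumes the BH choice $\alpha_j = j\alpha/d$ purely for illustration, and it assumes $R_{-i} = 0$ when no index satisfies the inequality. The function name and the example p-values are hypothetical.

```python
def leave_one_out_count(p_values, i, critical_values):
    """R_{-i}: largest j in 1..d-1 such that the j-th smallest p-value,
    with the i-th p-value removed, is <= alpha_{j+1}.
    Returns 0 if no such j exists (an assumed convention)."""
    d = len(p_values)
    others = sorted(p for k, p in enumerate(p_values) if k != i)
    r = 0
    for j in range(1, d):  # j = 1, ..., d-1
        # critical_values is a 0-based list, so critical_values[j] is alpha_{j+1}.
        if others[j - 1] <= critical_values[j]:
            r = j
    return r

# Assumed BH critical values alpha_j = j * alpha / d (not given in the excerpt).
d, alpha = 4, 0.1
bh_critical_values = [j * alpha / d for j in range(1, d + 1)]

p = [0.01, 0.30, 0.04, 0.90]  # made-up p-values
print(leave_one_out_count(p, 0, bh_critical_values))
print(leave_one_out_count(p, 1, bh_critical_values))
```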

Figures (25)

  • Figure 1: Simulated power (left column) and simulated FDR (middle column) for a fixed null proportion, and simulated power (right column) for a fixed signal strength, for mean testing of $d=40$ parameters. Methods compared are SBH1 (circle, black), SBH2 (triangle point up, red), GSBH1 (plus, green), GSBH2 (cross, blue), GSBH3 (diamond, light blue), GSBH4 (triangle point down, purple), GSBH5 (square cross, yellow), GSBH6 (star, grey), BH (diamond plus, black), BY (circle plus, red), dBH (triangles up and down, green), and dBY (square plus, blue).
  • Figure 2: Simulated power (left column) and simulated FDR (middle column) for a fixed null proportion, and simulated power (right column) for a fixed signal strength, for mean testing of $d=40$ parameters. Methods compared are SBH1 (circle, black), SBH2 (triangle point up, red), GSBH1 (plus, green), GSBH2 (cross, blue), GSBH3 (diamond, light blue), GSBH4 (triangle point down, purple), GSBH5 (square cross, yellow), GSBH6 (star, grey), BH (diamond plus, black), BY (circle plus, red), dBH (triangles up and down, green), and dBY (square plus, blue).
  • Figure 3: Simulated power (left column) and simulated FDR (middle column) for a fixed null proportion, and simulated power (right column) for a fixed signal strength, for variable selection from $d=40$ parameters. Methods compared are SBH1 (circle, black), SBH2 (triangle point up, red), GSBH1 (plus, green), GSBH2 (cross, blue), GSBH3 (diamond, light blue), GSBH4 (triangle point down, purple), GSBH5 (square cross, yellow), GSBH6 (star, grey), BH (diamond plus, black), BY (circle plus, red), dBH (triangles up and down, green), and dBY (square plus, blue).
  • Figure 4: Simulated power (left column) and simulated FDR (right column) for knockoff-assisted variable selection from $d=40$ parameters at level $\alpha=0.2$. Methods compared are BH (circle, black), BBH (triangle point up, red), Adapt-BBH (plus, green), SBBH1 (cross, blue), SBBH2 (diamond, light blue), SBBH3 (triangle point down, purple), SBBH4 (square cross, yellow), SBBH5 (star, grey), Knockoff (diamond plus, black), Rev-BBH (circle plus, red), and BBY (triangles up and down, green).
  • Figure 5: Selected results of drug resistance for $\alpha = 0.05$: (a) resistance to IDV; (b) resistance to NFV; (c) resistance to SQV. Dark blue indicates protease positions that appear in the treatment-selected mutation (TSM) panel for the PI class of treatments, while orange indicates positions selected by the method that do not appear in the TSM list.
  • ...and 20 more figures

Theorems & Definitions (6)

  • Lemma 1
  • Remark 1
  • Lemma 2
  • Definition 1
  • Remark 2
  • Definition 2