
Sensitivity Analysis for False Discovery Rate Estimation with Published p-Values

Tianyu Cao, Sangyoon Yi, Joshua Habiger

Abstract

There is recent interest in estimating the false discovery rate (FDR) with published p-values. However, there is little formal research that addresses the manner and extent to which the presumed selection, or publication, bias model impacts the bias and variance of FDR estimators. This manuscript provides general and closed-form expressions for the bias and variance of an established FDR estimator when the publication bias model ($p<0.05$) may or may not be correct. The expressions reveal that FDR estimates can be conservative or liberal, depending on how well a $p<0.05$ publication rule approximates the true selection mechanism. Analysis of a well-studied large-scale replication project in psychology, where selection model parameters are estimable, suggests that the bias expressions are accurate in practice. Another well-studied collection of p-values, mined from medical journal abstracts, is used to illustrate how the provided closed-form expressions may facilitate a simple sensitivity analysis when the goal is FDR estimation using selected p-values with an unknown selection mechanism.

Paper Structure (6 sections, 5 theorems, 36 equations, 2 figures, 2 tables)


Key Result

Theorem 3.2

Assume $H_i=0$ is rejected if $P_i\leq \alpha$ for some $\alpha \in (0,1)$ and let $\delta_i = 1$ denote the event that $P_i$ is selected. Under the models in Definitions 2.1 and 2.2 and the FDR in Definition 3.1,

Figures (2)

  • Figure 1: Bias in \ref{cstep} over $\rho$ and power (corresponding to $\gamma$) at $\lambda = 0.01, 0.025, 0.045$ and $\pi_{0} = 0.3, 0.5, 0.8$ when $\alpha = 0.05$
  • Figure 2: Bias in \ref{eq:biasbeta} over $\eta$ and power (corresponding to $\gamma$) at $\lambda = 0.01, 0.025, 0.045$ and $\pi_{0} = 0.3, 0.5, 0.8$ when $\alpha = 0.05$

Theorems & Definitions (14)

  • Definition 2.1: P-value Mixture Model
  • Definition 2.2: Selection Probability Model
  • Definition 3.1: Post-Selection False Discovery Rate
  • Theorem 3.2
  • Definition 3.3: Post-Selection FDR Estimator
  • Theorem 4.1
  • Theorem 4.2
  • Remark 1
  • Remark 2
  • Proof of Theorem \ref{bayes}
  • ...and 4 more