Sensitivity Analysis for False Discovery Rate Estimation with Published p-Values

Tianyu Cao; Sangyoon Yi; Joshua Habiger

Abstract

There is recent interest in estimating the false discovery rate (FDR) with published p-values. However, there is little formal research on how, and to what extent, the presumed selection (or publication) bias model affects the bias and variance of FDR estimators. This manuscript provides general and closed-form expressions for the bias and variance of an established FDR estimator when the presumed publication bias model ($p<0.05$) may or may not be correct. The expressions reveal that FDR estimates could be conservative or liberal, depending on how well a $p<0.05$ publication rule approximates the true selection mechanism. Analysis of a well-studied large-scale replication project in psychology, where selection model parameters are estimable, suggests that the bias expressions are accurate in practice. Another well-studied collection of p-values, mined from medical journal abstracts, is used to illustrate how the closed-form expressions may facilitate a simple sensitivity analysis when the goal is FDR estimation using selected p-values with an unknown selection mechanism.
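As a rough illustration of the setting described above, the following sketch simulates p-values from a two-group mixture, keeps only those passing a strict $p<0.05$ publication rule, and compares the share of true nulls among the rejected, selected p-values with the value given by Bayes' rule. Everything here is an assumption made for illustration and is not taken from the manuscript: the function names, the parameter values, and in particular the choice of alternative p-values with CDF $p^{\gamma}$. It does not implement the FDR estimator or the bias and variance expressions studied in the paper.

```python
import random


def simulate_post_selection_fdr(n=200_000, pi0=0.5, gamma=0.2,
                                alpha=0.05, seed=0):
    """Monte Carlo share of true nulls among selected, rejected p-values.

    Assumptions (illustrative only): null p-values are Uniform(0, 1),
    alternative p-values have CDF p ** gamma, and a p-value is selected
    (published) exactly when p < 0.05.
    """
    rng = random.Random(seed)
    discoveries = 0
    false_discoveries = 0
    for _ in range(n):
        is_null = rng.random() < pi0
        u = rng.random()
        # Inverse-CDF draw: u ** (1 / gamma) has CDF p ** gamma on (0, 1).
        p = u if is_null else u ** (1.0 / gamma)
        selected = p < 0.05          # presumed publication rule
        if selected and p <= alpha:  # rejected among the selected
            discoveries += 1
            false_discoveries += is_null
    return false_discoveries / discoveries


def closed_form_fdr(pi0=0.5, gamma=0.2, alpha=0.05):
    """Bayes-rule value of P(null | P <= alpha) for alpha <= 0.05."""
    null_mass = pi0 * alpha
    alt_mass = (1.0 - pi0) * alpha ** gamma
    return null_mass / (null_mass + alt_mass)


if __name__ == "__main__":
    print("simulated:", simulate_post_selection_fdr())
    print("closed form:", closed_form_fdr())
```

Under these assumptions the two numbers should agree up to Monte Carlo error. Replacing the line that sets `selected` with a different rule is one simple way to explore how a departure from the $p<0.05$ publication model changes which p-values are observed.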

Paper Structure (6 sections, 5 theorems, 36 equations, 2 figures, 2 tables)

Introduction
Models
Post-Selection False Discovery Rate
Bias and Variance Expressions
Application
Conclusion

Key Result

Theorem 3.2

Assume $H_i=0$ is rejected if $P_i\leq \alpha$ for some $\alpha \in (0,1)$ and let $\delta_i = 1$ denote the event that $P_i$ is selected. Under the models in Definitions 2.1 and 2.2 and the FDR in Definition 3.1,

Figures (2)

Figure 1: Bias in \ref{cstep} over $\rho$ and power (corresponding to $\gamma$) at $\lambda = 0.01, 0.025, 0.045$ and $\pi_{0} = 0.3, 0.5, 0.8$ when $\alpha = 0.05$.
Figure 2: Bias in \ref{eq:biasbeta} over $\eta$ and power (corresponding to $\gamma$) at $\lambda = 0.01, 0.025, 0.045$ and $\pi_{0} = 0.3, 0.5, 0.8$ when $\alpha = 0.05$.

Theorems & Definitions (14)

Definition 2.1: P-value Mixture Model
Definition 2.2: Selection Probability Model
Definition 3.1: Post-Selection False Discovery Rate
Theorem 3.2
Definition 3.3: Post-Selection FDR Estimator
Theorem 4.1
Theorem 4.2
Remark 1
Remark 2
Proof of Theorem \ref{bayes}
...and 4 more
