
Applied Statistics Requires Scientific Context

Ashley I Naimi

Abstract

Statistical methods are indispensable to scientific inference. However, there exists a longstanding tension across a wide range of scientific disciplines about the role that ``context'' should play in the application of statistical methods and the interpretation of statistical results. Though frequently invoked, the notion of ``scientific context'' refers to at least two distinct concepts: a set of foundational, nuanced, and elusive background assumptions and substantive features of a given area of study that shape the validity and reliability of statistical methods; and more quantifiable contextual issues that affect the performance of statistical methods and the interpretation of statistical results. I argue here that the application and interpretation of statistical methods require careful consideration of foundational contextual issues. To motivate these arguments, I review a recent re-formulation of the $p$-value as a measure of divergence between an observed dataset and a set of assumptions used to construct statistical measures. I use this framework to illustrate the role that context plays in two randomized trials: one on low-dose aspirin for pregnancy loss, and one on a new inhibitor of a key biochemical pathway affecting ankylosing spondylitis. Finally, I note that the adoption of low significance thresholds in genome-wide association studies and high-energy particle physics has succeeded largely because of the extensive validity-checking gauntlets and contextual considerations that have accompanied these low thresholds, not because of the low thresholds themselves. I use these illustrations and arguments to suggest that (i) the adoption of a universal threshold for significance testing should be abandoned as a goal of statistics reform; and (ii) the validity and optimal use of applied statistical tools require careful consideration of nuanced scientific context.



Figure 1: Geometric interpretation of the $p$-value. The left panel shows the observed data point $z = (\bar{Y}_1, \bar{Y}_0)$ and its orthogonal projection onto the model manifold $M$, represented by the solid diagonal line, yielding an empirical measure of discrepancy between $z$ and $M$, denoted $d(z;M)$ and indexed by the dashed line. The right panel shows the reference $\chi^2_1$ distribution of $T$, the variance-standardized measure of $d(z; M)$, under $M$, with the shaded upper-tail region corresponding to the $p$-value $p = \Pr(\chi^2_1 \ge T)$. Together, these panels illustrate the $p$-value as a quantile location of the observed data-model divergence.
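The final step in the caption can be sketched numerically. Assuming a precomputed variance-standardized divergence $T$ (the data and projection steps are not reproduced here), the $\chi^2_1$ upper-tail probability reduces to the complementary error function, so no statistical library is needed:

```python
import math

def p_value_from_divergence(T: float) -> float:
    """Upper-tail probability Pr(chi^2_1 >= T): the p-value as the
    quantile location of the observed data-model divergence.

    For one degree of freedom, Pr(chi^2_1 >= T) = Pr(|Z| >= sqrt(T))
    for standard normal Z, which equals erfc(sqrt(T / 2)).
    """
    return math.erfc(math.sqrt(T / 2.0))

# Hypothetical illustration: T near 3.84, the familiar 5% cutoff
# for a chi-square statistic with one degree of freedom.
print(p_value_from_divergence(3.84))  # roughly 0.05
```

This is a sketch of the reference-distribution step only; in practice $T$ would come from the standardized discrepancy $d(z; M)$ described in the caption.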