The piranha problem: Large effects swimming in a small pond

Christopher Tosh, Philip Greengard, Ben Goodrich, Andrew Gelman, Aki Vehtari, Daniel Hsu

Abstract

In some scientific fields, it is common to have certain variables of interest that are of particular importance and for which there are many studies indicating a relationship with different explanatory variables. In such cases, particularly those where no relationships are known among the explanatory variables, it is worth asking under what conditions it is possible for all such claimed effects to exist simultaneously. This paper addresses this question by reviewing some theorems from multivariate analysis showing that, unless the explanatory variables also have sizable dependencies with each other, it is impossible to have many such large effects. We discuss implications for the replication crisis in social science.

Paper Structure

This paper contains 14 sections, 9 theorems, 45 equations.

Key Result

Theorem 1

If $X_1, \dotsc, X_p, y$ are real-valued random variables with finite nonzero variance, then

$$\sum_{i=1}^{p} |\operatorname{corr}(X_i,y)| \leq \sqrt{p + \sum_{i \neq j} |\operatorname{corr}(X_i,X_j)|}.$$

In particular, if $|\operatorname{corr}(X_i,y)| \geq \tau$ for each $i=1,\dotsc, p$, then $\sum_{i \neq j} |\operatorname{corr}(X_i,X_j)| \geq p(\tau^2 p - 1)$.
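
The consequence stated in Theorem 1 can be checked numerically on sample correlations, because the empirical distribution of a data set is itself a distribution with finite variance. The following is a minimal sketch, not code from the paper: the function name `piranha_check`, the simulated shared-factor data, and the sample sizes are all illustrative assumptions.

```python
import numpy as np

def piranha_check(X, y):
    """Compare the two sides of the Theorem 1 consequence on sample data.

    Returns (tau, total, bound), where
      tau   = min_i |corr(X_i, y)|
      total = sum over i != j of |corr(X_i, X_j)|
      bound = p * (tau**2 * p - 1)
    The theorem says total >= bound.
    """
    p = X.shape[1]
    # Correlation matrix of the p predictors plus the outcome (last column).
    full = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    corr_xy = np.abs(full[:p, p])    # |corr(X_i, y)| for each predictor
    corr_xx = np.abs(full[:p, :p])   # |corr(X_i, X_j)| among predictors
    tau = corr_xy.min()
    total = corr_xx.sum() - np.trace(corr_xx)  # drop the diagonal (i == j)
    bound = p * (tau**2 * p - 1)
    return tau, total, bound

# Hypothetical data: every predictor and the outcome share one latent factor z,
# so each predictor correlates with y AND with every other predictor.
rng = np.random.default_rng(0)
n, p = 2000, 10
z = rng.standard_normal(n)
X = z[:, None] + rng.standard_normal((n, p))
y = z + rng.standard_normal(n)

tau, total, bound = piranha_check(X, y)
print(f"tau={tau:.3f}  sum of |corr(X_i,X_j)|={total:.2f}  lower bound={bound:.2f}")
```

In this simulated setting the predictors can all correlate with `y` only because they also correlate with each other, which is the trade-off the theorem quantifies.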

Theorems & Definitions (17)

  • Theorem 1: Van der Corput's inequality
  • proof
  • Corollary 2
  • Theorem 3
  • Lemma 4
  • proof
  • proof: Proof of the correlation eigenvalue piranha theorem
  • Theorem 5
  • proof: Proof of the regression piranha theorem
  • Theorem 6
  • ...and 7 more