What the Jeffreys-Lindley Paradox Really Is: Correcting a Persistent Misconception

Miodrag M. Lovric

TL;DR

The paper clarifies a long-standing confusion between the Jeffreys–Lindley paradox and Bartlett's Anomaly, showing that Lindley's paradox concerns sample-size asymptotics with a fixed p-value, while Bartlett's Anomaly concerns prior diffuseness. It argues that Robert (1993) conflated the two phenomena and that the paradox does not vanish with prior tweaks alone. The proposed resolution is to replace point null hypotheses with interval nulls, which yields concordant Bayesian and frequentist conclusions in the regime of practically negligible effects. The work underscores the practical relevance of the paradox in modern large-scale data and advocates a shift toward interval-based hypotheses to restore coherence across inferential frameworks.

Abstract

The Jeffreys-Lindley paradox stands as the most profound divergence between frequentist and Bayesian approaches to hypothesis testing. Yet despite more than six decades of discussion, this paradox remains frequently misunderstood, even in the pages of leading statistical journals. In a 1993 paper published in Statistica Sinica, Robert characterized the Jeffreys-Lindley paradox as "the fact that a point null hypothesis will always be accepted when the variance of a conjugate prior goes to infinity." This characterization, however, describes a different phenomenon entirely (what we term Bartlett's Anomaly) rather than the Jeffreys-Lindley paradox as originally formulated. The paradox, as presented by Lindley (1957), concerns what happens as sample size increases without bound while holding the significance level fixed, not what happens as prior variance diverges. This distinction is not merely terminological: the two phenomena have different mathematical structures, different implications, and require different solutions. The present paper aims to clarify this confusion, demonstrating through Lindley's own equations that he was concerned exclusively with sample size asymptotics. We show that even Jeffreys himself underestimated the practical frequency of the paradox. Finally, we argue that the only genuine resolution lies in abandoning point null hypotheses in favor of interval nulls, a paradigm shift that eliminates the paradox and restores harmony between Bayesian and frequentist inference. Submitted to Statistica Sinica.

Paper Structure

This paper contains 11 sections, 17 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Visual comparison of two distinct phenomena, computed for a normal model with known variance. (A) The Jeffreys-Lindley paradox (Lindley, 1957) occurs when sample size $n \to \infty$ with fixed prior and fixed $p$-value. Here $z = 1.96$ (corresponding to $\alpha = 0.05$), prior scale $\tau = \sigma_0/\sigma = 1$ (a standard "unit information" choice), and prior probability $\pi_0 = 0.5$ (equal prior odds). Note that $\bar{x}$ approaches $\theta_0$ as $n \to \infty$ to maintain the fixed-$z$ constraint; the observed effect shrinks but remains "significant." The posterior probability $P(H_0|\bar{x})$ increases toward 1 as $n$ grows, creating "strong contrast" with the frequentist rejection. (B) Bartlett's Anomaly (Bartlett, 1957) occurs when prior variance increases with fixed data and fixed sample size. Here $n = 100$, $z = 2.5$ (yielding $p \approx 0.012$), and $\pi_0 = 0.5$; the data $\bar{x}$ remain unchanged throughout. The horizontal axis shows prior scale $\tau = \sigma_0/\sigma$ (so prior variance is $\tau^2\sigma^2$). Even data yielding $p \approx 0.012$ lead to $P(H_0|\bar{x}) \to 1$ as the prior becomes diffuse. These are fundamentally different phenomena with different drivers: likelihood concentration (A) versus prior diffuseness (B).
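
The two panels can be approximated numerically with a short sketch. The caption does not give the formula behind the figure, so the code below assumes the standard conjugate-normal setup: under $H_1$, $\theta \sim N(\theta_0, \tau^2\sigma^2)$, which gives the Bayes factor $BF_{01} = \sqrt{1 + n\tau^2}\,\exp\!\left(-\tfrac{z^2}{2}\,\tfrac{n\tau^2}{1 + n\tau^2}\right)$. The function names `bayes_factor_01` and `posterior_h0` and the grids of $n$ and $\tau$ values are illustrative choices, not taken from the paper.

```python
import math


def bayes_factor_01(z, n, tau):
    """Bayes factor in favor of H0: theta = theta0 (assumed conjugate-normal model).

    Assumes x_bar ~ N(theta, sigma^2 / n) and, under H1,
    theta ~ N(theta0, tau^2 * sigma^2), with z = sqrt(n) * (x_bar - theta0) / sigma.
    """
    u = n * tau ** 2
    return math.sqrt(1.0 + u) * math.exp(-0.5 * z ** 2 * u / (1.0 + u))


def posterior_h0(z, n, tau, pi0=0.5):
    """Posterior probability of H0 given prior probability pi0."""
    bf = bayes_factor_01(z, n, tau)
    return pi0 * bf / (pi0 * bf + (1.0 - pi0))


# Panel (A) setting: z and tau fixed, sample size n grows.
panel_a = [(n, posterior_h0(1.96, n, 1.0)) for n in (10, 100, 10_000, 1_000_000)]

# Panel (B) setting: n and z fixed, prior scale tau grows.
panel_b = [(tau, posterior_h0(2.5, 100, tau)) for tau in (1, 10, 100, 1000)]

for n, p in panel_a:
    print(f"(A) n = {n:>9,d}  P(H0 | x) = {p:.4f}")
for tau, p in panel_b:
    print(f"(B) tau = {tau:>5d}  P(H0 | x) = {p:.4f}")
```

Under these assumptions the posterior probability of $H_0$ rises toward 1 along both grids, but for different reasons: in (A) the term $n\tau^2$ grows because $n$ grows with $\tau$ fixed, while in (B) it grows because $\tau$ grows with $n$ and the data fixed. For very small $n\tau^2$ (below roughly $z^2 - 1$) the posterior is not monotone, which is why the grids start at moderate values.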