Confidence as Forecast: A Decision-Theoretic Interpretation of Confidence Intervals
Scott Lee
TL;DR
The paper reframes confidence intervals as probabilistic forecasts of coverage. It treats $1-\alpha$ as the ex ante, design-level success probability and introduces a predictive probability for coverage that can be refined with design-based information and evaluated with strictly proper scoring rules. In standard unbounded, translation-invariant models, the constant forecast $1-\alpha$ is optimal; in finite-window or bounded-parameter designs, $\theta$-free statistics permit informative post-trial refinements such as $P(\theta\in I(X)\mid T(X))$. Through two thought experiments, a Monty Hall-style shell game and the lost submarine example, the author shows how this forecasting view resolves interpretational puzzles about CIs without invoking priors. The paper also gives practical guidance for applied work and discusses implications for teaching CIs as tools for forecasting long-run coverage. Overall, it offers a frequentist framework that connects design-based guarantees with data-driven predictive updates.
Abstract
What, if anything, should a frequentist say about a single realized confidence interval (CI) and its chance of having covered the parameter? Jerzy Neyman's original answer was to refuse any nondegenerate probability for coverage ex post and, instead, to "state that the interval covers". In this paper I argue that the usual frequentist machinery already supports a different reading. I treat the coverage event as a Bernoulli random variable, with the nominal level $1-\alpha$ as its design-based success probability, and view "confidence" as a probability forecast for that Bernoulli outcome. Using strictly proper scoring rules, I show that $1-\alpha$ is the unique optimal constant forecast for coverage, both before and after observing the data, and that it remains optimal post-trial in common unbounded, translation-invariant models with pivot-based CIs. When the design yields a $\theta$-free statistic, such as the relative width of the interval in a finite-window uniform model, the conditional coverage given that statistic provides a nonconstant, design-based refinement of $1-\alpha$ that strictly improves predictive performance. Two thought experiments, a Monty Hall-style shell game and the "lost submarine" example of Morey et al. (2016), illustrate how this perspective resolves familiar interpretational puzzles about CIs without appealing to priors or single-case subjective degrees of belief. I conclude with simple "what to do when you see an interval" guidance for applied work and some implications for teaching confidence intervals as tools for forecasting long-run coverage.
Keywords: Confidence intervals, coverage probability, proper scoring rules, probabilistic forecasting, frequentist inference
Disclaimer: The findings and conclusions in this report are those of the author and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
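The contrast between a constant forecast and a refinement based on a $\theta$-free statistic can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a toy finite-window design in the spirit of the lost submarine example: two observations drawn uniformly from $[\theta-5, \theta+5]$, with the interval $[\min, \max]$ as a nominal 50% CI. The function name `simulate`, the model, and the closed-form conditional coverage $d/(10-d)$ for width $d<5$ (and 1 otherwise) are all illustrative assumptions. Forecasts are scored with the Brier score, one example of a strictly proper scoring rule.

```python
import random


def simulate(n_trials=100_000, theta=0.0, half_width=5.0, seed=0):
    """Compare two coverage forecasts under an assumed toy uniform model.

    Assumed model (hypothetical, for illustration only): x1, x2 are i.i.d.
    Uniform(theta - half_width, theta + half_width); the CI is [min, max],
    which covers theta with design probability 0.5.
    """
    rng = random.Random(seed)
    n_covered = 0
    brier_const = 0.0  # running sum for the constant forecast 1 - alpha = 0.5
    brier_cond = 0.0   # running sum for the forecast conditioned on the width

    for _ in range(n_trials):
        x1 = rng.uniform(theta - half_width, theta + half_width)
        x2 = rng.uniform(theta - half_width, theta + half_width)
        lo, hi = min(x1, x2), max(x1, x2)
        covered = 1 if lo <= theta <= hi else 0

        # The width d does not depend on theta, so it can inform the forecast.
        d = hi - lo
        # Conditional coverage given d in this toy model (assumed closed form):
        # coverage is certain once the width exceeds half the window.
        if d >= half_width:
            p_cond = 1.0
        else:
            p_cond = d / (2.0 * half_width - d)

        n_covered += covered
        brier_const += (0.5 - covered) ** 2
        brier_cond += (p_cond - covered) ** 2

    return {
        "coverage": n_covered / n_trials,
        "brier_const": brier_const / n_trials,
        "brier_cond": brier_cond / n_trials,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, the empirical coverage should sit near 0.5, the constant forecast scores exactly 0.25, and the width-based forecast should score noticeably lower (roughly 0.11 in expectation). Note that the true value of `theta` is used only to score the forecasts, never to form them.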
