Confidence as Forecast: A Decision-Theoretic Interpretation of Confidence Intervals

Scott Lee

TL;DR

The paper reframes confidence intervals as probabilistic forecasts of coverage. It treats $1-\alpha$ as the ex ante, design-level success probability of the coverage event and introduces a predictive probability for empirical coverage that can be updated with information and is evaluated with strictly proper scoring rules. In standard unbounded, translation-invariant models the constant forecast $1-\alpha$ is optimal; in finite-window or bounded-parameter designs, $\theta$-free statistics permit informative post-trial refinements such as $P(\theta\in I(X)\mid T(X))$. Two thought experiments, a Monty Hall-style shell game and the lost submarine example, show how this forecasting view resolves interpretational puzzles about CIs without invoking priors. The paper also gives practical guidance for applied work and discusses implications for teaching CIs as forecasting tools with a clear long-run coverage interpretation. Overall, it offers a frequentist framework that connects design-based guarantees with data-driven predictive updates.
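The optimality of the constant forecast $1-\alpha$ can be checked numerically. For a Bernoulli outcome $C$ with success probability $p$, the expected Brier score $E[(q-C)^2] = (q-p)^2 + p(1-p)$ is minimized only at $q = p$. The sketch below is an illustration rather than the paper's own code: it assumes a normal mean with known variance, a two-sided 95% z-interval, and the Brier score as the strictly proper scoring rule. The function names, sample sizes, and the forecast grid are all hypothetical choices.

```python
import math
import random


def simulate_coverage(n_trials=20000, n=10, theta=3.0, sigma=1.0,
                      z=1.959964, seed=0):
    """Return 0/1 coverage indicators for repeated 95% z-intervals.

    Assumed model (not from the paper): X_1..X_n ~ Normal(theta, sigma^2)
    with sigma known, interval = xbar +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = []
    for _ in range(n_trials):
        xbar = sum(rng.gauss(theta, sigma) for _ in range(n)) / n
        hits.append(1 if abs(xbar - theta) <= half_width else 0)
    return hits


def mean_brier(q, outcomes):
    """Average Brier score (lower is better) of the constant forecast q."""
    return sum((q - c) ** 2 for c in outcomes) / len(outcomes)


hits = simulate_coverage()
coverage_rate = sum(hits) / len(hits)

# Score every constant forecast on a coarse grid and keep the best one.
grid = [k / 100 for k in range(50, 100)]
best_q = min(grid, key=lambda q: mean_brier(q, hits))

print(f"empirical coverage: {coverage_rate:.3f}")
print(f"best constant forecast on the grid: {best_q:.2f}")
```

Under these assumptions the best grid point should sit at or next to the nominal level, since the average Brier score of a constant forecast is minimized at the empirical coverage rate.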

Abstract

What, if anything, should a frequentist say about a single realized confidence interval (CI) and its chance of having covered the parameter? Jerzy Neyman's original answer was to refuse any nondegenerate probability for coverage ex post and, instead, to "state that the interval covers". In this paper I argue that the usual frequentist machinery already supports a different reading. I treat the coverage event as a Bernoulli random variable, with the nominal level $1-\alpha$ as its design-based success probability, and view "confidence" as a probability forecast for that Bernoulli outcome. Using strictly proper scoring rules, I show that $1-\alpha$ is the unique optimal constant forecast for coverage, both before and after observing the data, and that it remains optimal post-trial in common unbounded, translation-invariant models with pivot-based CIs. When the design yields a $\theta$-free statistic, such as the relative width of the interval in a finite-window uniform model, the conditional coverage given that statistic provides a nonconstant, design-based refinement of $1-\alpha$ that strictly improves predictive performance. Two thought experiments, a Monty Hall-style shell game and the "lost submarine" example of Morey et al. (2016), illustrate how this perspective resolves familiar interpretational puzzles about CIs without appealing to priors or single-case subjective degrees of belief. I conclude with simple "what to do when you see an interval" guidance for applied work and some implications for teaching confidence intervals as tools for forecasting long-run coverage.

Keywords: Confidence intervals, coverage probability, proper scoring rules, probabilistic forecasting, frequentist inference

Disclaimer: The findings and conclusions in this report are those of the author and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Paper Structure (30 sections, 1 theorem, 47 equations, 2 tables)

1 Introduction
1.1 Background
1.2 Paper overview
2. 1 The setup
2.1 The parallels with CIs
3 Neyman through the looking-glass
3.1 What the model gives us
3.2 Forecasting with confidence
3.3.1 Minimizing risk pre-trial
3.3.2 Minimizing risk post-trial
3.3.3 Design-based refinement via theta-free conditional coverage
3.3.4 Defaulting to the confidence level
3.4 Framework summary
3.4.1 Unbounded location–scale models
3.4.2 Finite-window designs
...and 15 more sections

Key Result

Theorem 2.1

Assume there exists a measurable function $g : \mathrm{range}(T) \to [0,1]$ such that, for all $\theta \in \Theta$, Then the forecast rule uniquely minimizes the conditional expected score for every $\theta \in \Theta$ and almost every realization of $T(X)$. In particular, for all $\theta \in \Theta$ and all $\mathcal{G}$-measurable $q(X)$, with equality for a given $\theta$ only if $q(X) = q^
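One way the pieces of this statement could fit together is sketched below. The notation is assumed, not taken from the paper: $C(X) = \mathbf{1}\{\theta \in I(X)\}$ is the coverage indicator, $S(q, c)$ is a strictly proper scoring rule in loss orientation (smaller is better), and $\mathcal{G} = \sigma(T(X))$ is the information generated by the statistic. The form is chosen to agree with the refinement $P(\theta\in I(X)\mid T(X))$ named in the TL;DR, and the paper's own displays may differ in detail.

```latex
\begin{align*}
  % Assumed theta-free conditional coverage condition
  P_\theta\bigl(\theta \in I(X) \mid T(X)\bigr) &= g\bigl(T(X)\bigr)
    \quad \text{almost surely, for all } \theta \in \Theta, \\
  % Assumed form of the design-based forecast rule
  q^{*}(X) &= g\bigl(T(X)\bigr), \\
  % Assumed form of the risk comparison against any G-measurable forecast
  E_\theta\bigl[S\bigl(q^{*}(X), C(X)\bigr)\bigr]
    &\le E_\theta\bigl[S\bigl(q(X), C(X)\bigr)\bigr].
\end{align*}
```

Read this way, the theorem says that once the conditional coverage given $T(X)$ does not depend on $\theta$, reporting that conditional coverage is the best forecast available to anyone who sees only $T(X)$.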

Theorems & Definitions (1)

Theorem 2.1: Design-based optimal forecast with a $\theta$-free statistic
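A small simulation can illustrate what a $\theta$-free refinement looks like. The sketch below is an illustration under stated assumptions, not the paper's own example: two draws from Uniform$(\theta - 1/2, \theta + 1/2)$, the interval $[\min, \max]$ (a 50% CI in this model), the width $W$ as the $\theta$-free statistic, and the Brier score as the strictly proper rule. In this assumed model the conditional coverage works out to $g(w) = \min\{1, w/(1-w)\}$. All function names are hypothetical.

```python
import random


def conditional_coverage(w):
    """g(w) = P(theta in [min, max] | W = w) in the assumed uniform model.

    With two draws from Uniform(theta - 1/2, theta + 1/2), the midpoint is
    uniform on an interval of length 1 - w around theta, and the interval
    covers theta when the midpoint is within w / 2 of it.
    """
    return 1.0 if w >= 0.5 else w / (1.0 - w)


def simulate(n_trials=50000, theta=10.0, seed=1):
    """Return interval widths and 0/1 coverage indicators."""
    rng = random.Random(seed)
    widths, hits = [], []
    for _ in range(n_trials):
        x1 = rng.uniform(theta - 0.5, theta + 0.5)
        x2 = rng.uniform(theta - 0.5, theta + 0.5)
        lo, hi = min(x1, x2), max(x1, x2)
        widths.append(hi - lo)
        hits.append(1 if lo <= theta <= hi else 0)
    return widths, hits


def mean_brier(forecasts, outcomes):
    """Average Brier score (lower is better) of per-trial forecasts."""
    return sum((q - c) ** 2 for q, c in zip(forecasts, outcomes)) / len(outcomes)


widths, hits = simulate()
coverage_rate = sum(hits) / len(hits)

# Constant forecast at the nominal level versus the width-based refinement.
brier_constant = mean_brier([0.5] * len(hits), hits)
brier_refined = mean_brier([conditional_coverage(w) for w in widths], hits)

print(f"empirical coverage: {coverage_rate:.3f}")
print(f"Brier, constant 0.5 forecast: {brier_constant:.4f}")
print(f"Brier, width-based forecast:  {brier_refined:.4f}")
```

The forecast uses only the width, which does not depend on $\theta$, so no prior is involved. Under these assumptions the refined forecast should score well below the constant one, while the overall coverage rate stays near the nominal 50%.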
