Asymptotic optimality theory of confidence intervals of the mean

Vikas Deep, Achal Bassamboo, Sandeep Juneja

TL;DR

This work tackles the problem of constructing confidence intervals for the mean with a guaranteed coverage $1-\delta$ by identifying three asymptotic learning regimes based on the scaling of the target sample size with $\log(1/\delta)$. It proves that no learning occurs when $N_\delta/\log(1/\delta) \to 0$, yields sharp, distribution-dependent lower bounds in the sufficient regime $N_\delta/\log(1/\delta) \to k$, and achieves zero-width intervals in the complete regime $N_\delta/\log(1/\delta) \to \infty$, under a mild stability assumption. The authors show that CIs built by inverting concentration inequalities based on KL divergences are asymptotically optimal in both the sufficient and complete regimes for single-parameter exponential families and certain non-parametric families with bounded support or bounded moments, and extend the framework to one-sided CIs and to settings with random sampling costs where the limiting width depends only on the mean cost. A KL-inf-based extension provides analogous optimal constructions in the non-parametric case, with dual representations enabling practical computation. These results offer a unified, asymptotically optimal approach to CI construction for the mean with broad applicability in simulation, A/B testing, and resource-constrained data collection.
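As a rough illustration of the three regimes (a worked example that is not taken from the paper), assume Gaussian samples with known variance $\sigma^2$ and the textbook Chernoff-inversion interval $\hat{\mu}_N \pm \sqrt{2\sigma^2 \log(2/\delta)/N}$. Substituting the three scalings of $N_\delta$ and using $\log(2/\delta)/\log(1/\delta) \to 1$ gives the limits below. The symbols $\hat{\mu}_N$ and $\sigma$ are introduced here for illustration only, and no claim is made that the middle limit equals the paper's optimal constant.

```latex
\begin{align*}
\text{width}(N_\delta,\delta) &= 2\sqrt{\frac{2\sigma^{2}\log(2/\delta)}{N_\delta}}, \\
\frac{N_\delta}{\log(1/\delta)} \to 0
  &\;\Longrightarrow\; \text{width} \to \infty, \\
\frac{N_\delta}{\log(1/\delta)} \to k \in (0,\infty)
  &\;\Longrightarrow\; \text{width} \to 2\sigma\sqrt{2/k}, \\
\frac{N_\delta}{\log(1/\delta)} \to \infty
  &\;\Longrightarrow\; \text{width} \to 0.
\end{align*}
```

The first limit is consistent with the no-learning statement, since the Gaussian mean ranges over an unbounded set.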

Abstract

We address the classical problem of constructing confidence intervals (CIs) for the mean of a distribution, given \(N\) i.i.d. samples, such that the CI contains the true mean with probability at least \(1 - \delta\), where \(\delta \in (0,1)\). We characterize three distinct learning regimes based on the minimum achievable limiting width of any CI as the sample size \(N_\delta \to \infty\) and \(\delta \to 0\). In the first regime, where \(N_\delta\) grows slower than \(\log(1/\delta)\), the limiting width of any CI equals the width of the distribution's support, precluding meaningful inference. In the second regime, where \(N_\delta\) scales as \(\log(1/\delta)\), we precisely characterize the minimum limiting width, which depends on the scaling constant. In the third regime, where \(N_\delta\) grows faster than \(\log(1/\delta)\), complete learning is achievable, and the limiting width of the CI collapses to zero, converging to the true mean. We demonstrate that CIs derived from concentration inequalities based on Kullback--Leibler (KL) divergences achieve asymptotically optimal performance, attaining the minimum limiting width in both sufficient and complete learning regimes for distributions in two families: single-parameter exponential and bounded support. Additionally, these results extend to one-sided CIs, with the width notion adjusted appropriately. Finally, we generalize our findings to settings with random per-sample costs, motivated by practical applications such as stochastic simulators and cloud service selection. Instead of a fixed sample size, we consider a cost budget \(C_\delta\), identifying analogous learning regimes and characterizing the optimal CI construction policy.
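To make the idea of a KL-based CI concrete, here is a minimal Python sketch for Bernoulli samples. It inverts the standard Chernoff bound, keeping every candidate mean $q$ with $N \cdot \mathrm{KL}(\hat{\mu}, q) \le \log(2/\delta)$. This is an illustration under stated assumptions rather than the authors' construction: the function names, the union-bound threshold $\log(2/\delta)$, the bisection tolerance, and the fixed empirical mean of 0.5 used in `width_at` are all hypothetical choices made for this example.

```python
import math


def bernoulli_kl(p, q):
    """KL(Ber(p) || Ber(q)) in nats, using the convention 0 * log(0) = 0."""
    eps = 1e-12
    q = min(max(q, eps), 1.0 - eps)  # clip q away from 0 and 1 to avoid log(0)
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


def kl_confidence_interval(mean_hat, n, delta, tol=1e-10):
    """Return (lo, hi), the set {q in [0, 1] : n * KL(mean_hat, q) <= log(2/delta)}."""
    threshold = math.log(2.0 / delta) / n

    def solve(inside, outside):
        # `inside` always satisfies the KL constraint; move toward `outside`.
        if bernoulli_kl(mean_hat, outside) <= threshold:
            return outside  # the constraint never binds on this side
        while abs(outside - inside) > tol:
            mid = 0.5 * (inside + outside)
            if bernoulli_kl(mean_hat, mid) <= threshold:
                inside = mid
            else:
                outside = mid
        return inside

    return solve(mean_hat, 0.0), solve(mean_hat, 1.0)


def width_at(k, delta, mean_hat=0.5):
    """CI width when the sample size is n = ceil(k * log(1/delta))."""
    n = max(1, math.ceil(k * math.log(1.0 / delta)))
    lo, hi = kl_confidence_interval(mean_hat, n, delta)
    return hi - lo


if __name__ == "__main__":
    # Tiny k mimics the no-learning regime, large k the complete-learning regime.
    for k in (0.01, 1.0, 10.0, 1000.0):
        print(k, width_at(k, delta=1e-50))
```

With a very small `delta`, a small `k` leaves the interval equal to the whole support $[0, 1]$, while a large `k` shrinks it toward the empirical mean, mirroring the regimes described in the abstract. Bisection is used because the Bernoulli KL divergence is monotone in $q$ on each side of $\hat{\mu}$.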

Paper Structure

This paper contains 26 sections, 15 theorems, 94 equations, 1 figure, 2 tables.

Key Result

Theorem 1

For a given $\nu \in \mathbf{S}$ with mean $\mu$ and any $\pi \in \Pi^{s}_{\rm CI}$, the following holds: a) No-learning regime: if $\lim_{\delta \to 0} \frac{N_\delta}{\log(1/\delta)} = 0$, then $\left[ \mu_{R}^{\pi}(\mu) - \mu_{L}^{\pi}(\mu) \right] = \overline{\mu} - \underline{\mu}$. Further, ... where $\mu_{L}^{*}(\mu,k) < \mu$ and $\mu_{R}^{*}(\mu,k) > \mu$ uniquely solve the following system ...

Figures (1)

  • Figure 1: Comparison of our asymptotic lower bound, given in Theorem \ref{thm_spef_lower_bound_width}, as a function of $k$ with the lower bound presented in Proposition 4.3 of \cite{shekhar2023near}, when $\lim_{\delta \to 0} \frac{N_\delta}{\log(1/\delta)} = k$. We assume that $\nu = N(0,1)$ with known variance.

Theorems & Definitions (22)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Definition 2
  • Theorem 3
  • Remark 1
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • ...and 12 more