STaR-Bets: Sequential Target-Recalculating Bets for Tighter Confidence Intervals

Václav Voráček, Francesco Orabona

TL;DR

STaR-Bets introduces a finite-horizon, betting-based framework for constructing tight confidence intervals for the mean of a bounded random variable using sequential target-recalculating bets. The method builds on test martingales and the $\bigstar$ betting strategy, which adapts each bet to the remaining rounds, achieving a width of order $O\left(\sqrt{\frac{\sigma^2 \log\frac{1}{\delta}}{n}}\right)$, optimal up to a $1+o(1)$ factor, and strictly improving on classical Hoeffding- and Bernstein-type bounds in finite samples. The algorithm estimates the second-moment term online, discretizes the candidate means, and comes with finite-sample coverage guarantees while delivering near-optimal interval widths; the $\bigstar$-Bets variant attains the optimal rate up to negligible factors. Empirical results on Beta and Bernoulli distributions show competitive or superior interval tightness with guaranteed coverage, and an open-source implementation is available. Overall, the work closes the finite-sample gap for betting-based confidence intervals for bounded means.

Abstract

The construction of confidence intervals for the mean of a bounded random variable is a classical problem in statistics with numerous applications in machine learning and virtually all scientific fields. In particular, obtaining the tightest possible confidence intervals is vital whenever sampling the random variables is expensive. The current state-of-the-art approach to constructing confidence intervals uses betting algorithms. This approach is very successful for deriving optimal confidence sequences, even matching the rate of the law of the iterated logarithm. However, in the fixed-horizon setting, these approaches are either sub-optimal or based on heuristic solutions with strong empirical performance but without a finite-time guarantee. Hence, no betting-based algorithm guaranteeing the optimal $\mathcal{O}\left(\sqrt{\frac{\sigma^2\log\frac{1}{\delta}}{n}}\right)$ width of the confidence intervals is known. This work bridges this gap. We propose a betting-based algorithm to compute confidence intervals that empirically outperforms the competitors. Our betting strategy uses the optimal strategy in every step (in a certain sense), whereas the standard betting methods choose a constant strategy in advance. Leveraging this fact results in strict improvements even for classical concentration inequalities, such as those of Hoeffding or Bernstein. Moreover, we prove that the width of our confidence intervals is optimal up to a $1+o(1)$ factor diminishing with $n$. The code is available at https://github.com/vvoracek/STaR-bets-confidence-interval.
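To give a feel for why the $\sigma^2$ in the target rate matters, the following minimal sketch compares the textbook one-sided Hoeffding deviation for $[0,1]$-valued variables, $\sqrt{\log(1/\delta)/(2n)}$, with the leading term $\sqrt{2\sigma^2\log(1/\delta)/n}$ of a Bernstein-type bound. The constants are the standard ones from these classical inequalities, not values taken from the paper, and the function names are hypothetical.

```python
import math

def hoeffding_width(n, delta):
    """One-sided Hoeffding deviation for variables in [0, 1] (variance-agnostic)."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def variance_adaptive_width(n, delta, sigma2):
    """Leading term sqrt(2 * sigma^2 * log(1/delta) / n) of a Bernstein-type bound.

    Lower-order terms of Bernstein's inequality are deliberately ignored here.
    """
    return math.sqrt(2.0 * sigma2 * math.log(1.0 / delta) / n)

n, delta = 1000, 0.05
for p in (0.5, 0.9, 0.99):
    sigma2 = p * (1.0 - p)  # variance of a Bernoulli(p) variable
    print(p, hoeffding_width(n, delta), variance_adaptive_width(n, delta, sigma2))
```

The two expressions coincide at the worst-case variance $\sigma^2 = 1/4$, and the variance-adaptive one shrinks as the distribution concentrates near the boundary of $[0,1]$.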

Paper Structure

This paper contains 26 sections, 15 theorems, 33 equations, 5 figures, 1 table, 6 algorithms.

Key Result

Proposition 1

Let $W_0, W_1, \dots, W_n$ be a test process. For any $\delta \in (0,1)$, it holds that $\mathbb{P}\left\{W_n \geq \frac{1}{\delta} \right\} \leq \delta$.
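A quick numerical sanity check of this bound can be sketched as follows. The excerpt does not reproduce Definition 1, so the sketch assumes that a test process is a nonnegative process with $W_0 = 1$ whose expectation does not exceed $1$ under the null hypothesis. The fixed-fraction wealth process, the parameter values, and the function names are illustrative choices, not the paper's algorithm.

```python
import random

def final_wealth(xs, m, lam):
    """Wealth after betting a fixed fraction lam on (x - m) each round, with W_0 = 1.

    For x in {0, 1}, m = 0.5 and lam = 0.5, each factor is 0.75 or 1.25, so the
    wealth stays nonnegative and has expectation 1 when the true mean equals m.
    """
    w = 1.0
    for x in xs:
        w *= 1.0 + lam * (x - m)
    return w

def exceed_frequency(m=0.5, lam=0.5, n=100, delta=0.05, runs=2000, seed=0):
    """Fraction of simulated runs (null hypothesis true) with W_n >= 1/delta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        xs = [1.0 if rng.random() < m else 0.0 for _ in range(n)]
        if final_wealth(xs, m, lam) >= 1.0 / delta:
            hits += 1
    return hits / runs

freq = exceed_frequency()
print("empirical frequency of W_n >= 1/delta:", freq)
```

Under these assumptions the printed frequency should not exceed $\delta$ beyond simulation noise; in this configuration it is typically far below it, which illustrates that the Markov-type bound is valid but can be loose.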

Figures (5)

  • Figure 1: Comparison of the Hoeffding, $\bigstar$-Hoeffding, Bernstein, and $\bigstar$-Bernstein testing algorithms with $\delta = 0.05$ on $1000$ realizations of a Bernoulli random variable with mean $0.9$. (L): The final value of $W$ as a function of $m$ for each algorithm. The vanilla versions depend exponentially on $m$, while the $\bigstar$ versions virtually always end up with $W \in \{0, \frac{1}{\delta}\}$. The $\bigstar$ versions also reject the null hypothesis for more values of $m$. (R): The evolution of $W$ throughout the runs of the algorithms for $m=0.86$. Bernstein testing reached the required wealth but later lost it, unlike $\bigstar$-Bernstein testing, which stopped betting after reaching it. Towards the end, $\bigstar$-Hoeffding testing started betting very aggressively in order to have a chance of reaching the desired wealth.
  • Figure 2: Direct comparison of the widths of the confidence intervals; note the log-log scale. For every method and every $n=8, 16, \dots, 256$, we estimated the mean $1000$ times, each time on a fresh realization of the corresponding random variable, and plotted the average distance to the mean. (L): When estimating the mean of a Beta distribution, the performance approaches that of the T-test as $n$ increases. (R): When estimating a Bernoulli mean, the performance of $\bigstar$-Bets is very similar to that of the specialized optimal methods.
  • Figure 3: CDF figure: when the curve of a method passes through a point $(x,y)$, a $y$-fraction (of $1000$ repetitions) of its lower confidence bounds was smaller than $x$. The vertical magenta line shows the position of the mean, and the horizontal one shows the $1-\delta$ quantile. In both cases, $\bigstar$-Bets passes through the intersection, implying that the coverage is $\approx 1-\delta$. (Top) Estimation of the mean of a Beta distribution. (L): The T-test produces shorter intervals than $\bigstar$-Bets, but its coverage is significantly smaller than claimed. (R): Here, $\bigstar$-Bets produces the shortest intervals. (Bottom) Estimation of the mean of a Bernoulli distribution using $30$ (L) and $1000$ (R) samples. In the low-sample regime, $\bigstar$-Bets is mildly worse than the unbeatable randomized Clopper-Pearson, arguably better than standard Clopper-Pearson, and significantly better than the Hedging method of Waudby-Smith and Ramdas (2021). With more samples, $\bigstar$-Bets stays very close to the optimal intervals, while the competitor is still significantly worse.
  • Figure: Hoeffding testing
  • Figure: Testing with Bets
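The behaviour described in the Figure 1 caption (a bet fixed in advance versus a bet recomputed from the current wealth and the remaining rounds) can be sketched as follows. This is a hedged illustration, not the paper's algorithm listing: it assumes observations in $[0,1]$, uses the standard Hoeffding-lemma wealth update $W_t = W_{t-1}\exp(\lambda_t (X_t - m) - \lambda_t^2/8)$, and takes as the recalculation rule $\lambda_t = \sqrt{8\log\frac{1}{\delta W_{t-1}} / (n-t+1)}$, which is one plausible reading of "target-recalculating". All function names are hypothetical.

```python
import math
import random

def hoeffding_wealth(xs, m, delta):
    """Vanilla Hoeffding betting: a single bet size lam fixed in advance for all n rounds."""
    n = len(xs)
    lam = math.sqrt(8.0 * math.log(1.0 / delta) / n)
    w = 1.0
    for x in xs:
        w *= math.exp(lam * (x - m) - lam * lam / 8.0)
    return w

def star_hoeffding_wealth(xs, m, delta):
    """Assumed target-recalculating variant: lam is recomputed every round.

    The bet depends only on past observations (current wealth and rounds left),
    so the wealth remains a valid test process under the null 'mean <= m'.
    """
    n = len(xs)
    target = 1.0 / delta
    w = 1.0
    for t, x in enumerate(xs):
        if w >= target or w <= 0.0:
            break  # target reached (stop betting) or wealth numerically exhausted
        remaining = n - t
        lam = math.sqrt(8.0 * math.log(target / w) / remaining)
        w *= math.exp(lam * (x - m) - lam * lam / 8.0)
    return w

# Illustrative run loosely mirroring the caption: Bernoulli(0.9) data, delta = 0.05.
rng = random.Random(0)
xs = [1.0 if rng.random() < 0.9 else 0.0 for _ in range(1000)]
for m in (0.80, 0.86):
    print(m, hoeffding_wealth(xs, m, 0.05), star_hoeffding_wealth(xs, m, 0.05))
```

In this sketch the null hypothesis for a candidate $m$ is rejected when the final wealth is at least $1/\delta$. The recalculating variant freezes its wealth once the target is reached and bets more aggressively when few rounds remain, which matches the qualitative behaviour the caption reports.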

Theorems & Definitions (27)

  • Definition 1: Test process
  • Proposition 1: Markov's inequality
  • proof
  • Theorem 2
  • proof
  • Lemma 3: Hoeffding's lemma
  • Proposition 4
  • proof
  • Corollary 5
  • Remark 6
  • ...and 17 more