Power comparison of sequential testing by betting procedures

Amaury Durand, Olivier Wintenberger

TL;DR

This work develops a theory for safe, anytime-valid sequential testing using test supermartingales, focusing on bounded-mean hypotheses and two betting-based procedures: Hoeffding and Capital, including a two-step Capital variant. It provides non-asymptotic power guarantees, introduces variance-constrained alternatives, and derives explicit bounds on rejection times under general (and time-varying) alternatives, including multidimensional settings. The authors extend the framework to composite nulls and other functionals, and demonstrate applications to forecaster evaluation and comparative testing, with numerical simulations showing the relative strengths of FTL Hoeffding, EWA/ONS Capital, and 2-step strategies. The results highlight a detection boundary of order $\mathcal{O}(\log n / n)$ for the Capital test under suitable second-order conditions, while dimension and variance considerations guide the choice of betting strategy in practice. Overall, the paper advances safe sequential inference by linking online betting strategies, explicit power guarantees, and practical applications in forecasting and evaluation.

Abstract

In this paper, we derive power guarantees for some sequential tests of a bounded mean under general alternatives. We focus on testing procedures that use nonnegative supermartingales, which are anytime valid, and consider alternatives that coincide asymptotically with the null (e.g. vanishing mean) while still allowing rejection in finite time. Introducing variance constraints, we show that the alternative can be broadened while keeping power guarantees for certain second-order testing procedures. We also compare different test procedures in the multidimensional setting using characteristics of the rejection times. Finally, we extend our analysis to other functionals as well as to testing and comparing forecasters. Our results are illustrated with numerical simulations including bounded mean testing and comparison of forecasters.
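
To make the notion of an anytime-valid test built from a nonnegative supermartingale concrete, here is a minimal sketch of a one-dimensional betting test. It is an illustration under stated assumptions, not the procedures analyzed in the paper: observations are assumed to lie in $[0,1]$, the null is assumed to be "conditional mean at most $m$", and the bet `lam` is held fixed, whereas the paper studies adaptive betting strategies. The names `rejection_time`, `xs`, `m`, `lam` and `alpha` are hypothetical.

```python
import math


def rejection_time(xs, m, lam, alpha):
    """First n at which the capital W_n = prod_{i<=n} (1 + lam * (x_i - m))
    reaches 1/alpha, or None if it never does on the given sample.

    Assumptions (illustrative, not taken from the paper):
      - every x_i lies in [0, 1] and 0 < m < 1,
      - the bet lam is constant, with 0 <= lam < 1/m so each factor stays positive.
    Under the null "conditional mean <= m", W_n is a nonnegative supermartingale,
    so by Ville's inequality it reaches 1/alpha with probability at most alpha.
    """
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie strictly between 0 and 1")
    if not 0.0 <= lam < 1.0 / m:
        raise ValueError("lam must satisfy 0 <= lam < 1/m")
    threshold = math.log(1.0 / alpha)  # work on the log scale for stability
    log_w = 0.0
    for n, x in enumerate(xs, start=1):
        log_w += math.log1p(lam * (x - m))
        if log_w >= threshold:
            return n
    return None


# Observations far above the null mean: the capital grows by a factor 1.5 per step.
print(rejection_time([1.0] * 20, m=0.5, lam=1.0, alpha=0.05))
# Observations equal to the null mean: the capital stays at 1 and never rejects.
print(rejection_time([0.5] * 100, m=0.5, lam=1.0, alpha=0.05))
```

With a fixed bet the test is valid for any admissible `lam`, but its power depends heavily on that choice, which is why adaptive strategies are of interest.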

Paper Structure

This paper contains 42 sections, 28 theorems, 75 equations, 6 figures, 1 algorithm.

Key Result

Lemma 1.1

Let $(W_n)_{n\geq 1}$ and $(u_n)_{n\geq 1}$ be, respectively, a nonnegative stochastic process and a deterministic sequence satisfying [...]. Then [...], where $\tau_{\alpha}$ is defined in eq:def-tau and we define $\aleph((u_n)_{n\geq 1},x) := \inf \left\{n \geq 1\,:\; \inf_{k\geq n} u_k \geq x\right\}$.
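
The quantity $\aleph((u_n)_{n\geq 1},x)$ is the first index from which the sequence stays at or above the level $x$ forever. The sketch below computes it for a finite list, which is an assumption made for illustration: the definition concerns an infinite sequence, so the result is only meaningful if the sequence is known to stay above $x$ beyond the truncation horizon. The function name `aleph` is hypothetical, and `None` stands in for the convention $\inf\emptyset = \infty$.

```python
def aleph(u, x):
    """Smallest 1-based index n such that min(u[n-1:]) >= x, or None if none exists.

    `u` is a finite list standing in for the sequence (u_n)_{n>=1}; the tail
    infimum is therefore taken over the available horizon only (an assumption).
    """
    answer = None
    tail_min = float("inf")
    # Scan backwards so the running minimum is exactly inf_{k >= n} u_k.
    for n in range(len(u), 0, -1):
        tail_min = min(tail_min, u[n - 1])
        if tail_min >= x:
            answer = n  # the tail starting at n stays above x; try an earlier n
        else:
            break  # the tail minimum can only decrease for smaller n
    return answer


u = [0.0, 3.0, 1.0, 2.0, 4.0, 5.0]
print(aleph(u, 2.0))   # u_2 = 3 exceeds 2, but u_3 = 1 dips below, so the answer is later
print(aleph(u, 10.0))  # the level is never reached
```

Note that $\aleph$ differs from a first-passage index: a sequence may cross $x$ early and fall back below it, and only the last such dip matters.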

Figures (6)

  • Figure 1: Truncated rejection times for Experiment 1 with $a=b=0$ (constant mean and variance).
  • Figure 2: Truncated rejection times for Experiment 1 with $a\geq 0$, $b\geq 0$ (decreasing mean and variance) and $m=0.4$, $d=5$.
  • Figure 3: Examples of $X_t$ (first column), $\mu_t$ (second column) and the bets obtained by the different strategies (other columns) in Experiment 2 for different values of $a$ (rows) and $M$.
  • Figure 4: Evolution with $a$ of the truncated rejection time for Experiment 2 for different values of $M$.
  • Figure 5: Examples of $\log(W_n)$ in Experiment 2 for different values of $a$ (columns) and $M$ (rows). Dashed horizontal line represents the rejection threshold $\log(1/\alpha)$.
  • ...and 1 more figure

Theorems & Definitions (55)

  • Definition 1.1
  • Lemma 1.1
  • proof
  • Proposition 2.1
  • proof
  • Proposition 2.2
  • Proposition 2.3
  • Proposition 2.4
  • Proposition 2.5
  • Proposition 2.6
  • ...and 45 more