Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations

Aurelien Bibaut, Nathan Kallus, Michael Lindon

TL;DR

This work considers delayed-start normal-mixture sequential probability ratio tests and provides the first asymptotic type-I-error and expected-rejection-time guarantees under general non-parametric data generating processes, where the asymptotics are indexed by the test's burn-in time.

Abstract

Sequential tests and their implied confidence sequences, which are valid at arbitrary stopping times, promise flexible statistical inference and on-the-fly decision making. However, strong guarantees are limited to parametric sequential tests that under-cover in practice or concentration-bound-based sequences that over-cover and have suboptimal rejection times. In this work, we consider classic delayed-start normal-mixture sequential probability ratio tests, and we provide the first asymptotic type-I-error and expected-rejection-time guarantees under general non-parametric data generating processes, where the asymptotics are indexed by the test's burn-in time. The type-I-error results primarily leverage a martingale strong invariance principle and establish that these tests (and their implied confidence sequences) have type-I error rates asymptotically equivalent to the desired (possibly varying) $\alpha$-level. The expected-rejection-time results primarily leverage an identity inspired by Itô's lemma and imply that, in certain asymptotic regimes, the expected rejection time is asymptotically equivalent to the minimum possible among $\alpha$-level tests. We show how to apply our results to sequential inference on parameters defined by estimating equations, such as average treatment effects. Together, our results establish these (ostensibly parametric) tests as general-purpose, non-parametric, and near-optimal. We illustrate this via numerical simulations and a real-data application to A/B testing at Netflix.
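
As a rough illustration of the kind of test the abstract describes, the following Python sketch monitors a normal-mixture likelihood ratio only after a burn-in time. It is a minimal sketch under stated assumptions, not the paper's procedure: observations are assumed to be already standardized to unit variance, the mixing distribution is taken to be $N(0, 1/\lambda)$, and the rejection threshold is the plain $1/\alpha$ bound from Ville's inequality rather than the calibrated delayed-start threshold analyzed in the paper. The function names `nm_log_lr` and `delayed_start_nmsprt` are hypothetical.

```python
import math
import random


def nm_log_lr(s, t, lam):
    """Log normal-mixture likelihood ratio for testing mean zero.

    Assumes unit-variance observations with partial sum `s` after `t` steps
    and a N(0, 1/lam) mixing distribution over the alternative mean:
        log M_t = 0.5 * log(lam / (lam + t)) + s^2 / (2 * (lam + t)).
    """
    return 0.5 * math.log(lam / (lam + t)) + s * s / (2.0 * (lam + t))


def delayed_start_nmsprt(observations, alpha=0.05, t0=200, lam=1.0):
    """Return the first time t >= t0 at which the mixture LR reaches 1/alpha.

    Assumption: the threshold 1/alpha is the simple Ville-type bound, which is
    conservative for Gaussian data; it is NOT the paper's calibrated threshold.
    Returns None if the stream ends without a rejection.
    """
    log_threshold = math.log(1.0 / alpha)
    s = 0.0
    for t, x in enumerate(observations, start=1):
        s += x  # running partial sum S_t
        if t >= t0 and nm_log_lr(s, t, lam) >= log_threshold:
            return t
    return None


def simulate_rejection_time(mu, n=20000, seed=0, **kwargs):
    """Run the sketch on n simulated N(mu, 1) observations (illustrative only)."""
    rng = random.Random(seed)
    stream = (rng.gauss(mu, 1.0) for _ in range(n))
    return delayed_start_nmsprt(stream, **kwargs)
```

Calling `simulate_rejection_time(0.1)` gives a rejection time (or `None`) for a small positive drift, while `simulate_rejection_time(0.0)` exercises the null case. The burn-in `t0` only delays when monitoring starts; the statistic itself still accumulates all observations from the first one.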

Paper Structure (77 sections, 33 theorems, 150 equations, 6 figures, 1 table)

Key Result

Lemma 1

For any $t_0 \geq 1$ and $\lambda > 0$, Equivalently,

Figures (6)

  • Figure 1: Empirical calibration on Brownian data of $\lambda$ for the normal mixture SPRT without burn-in period.
  • Figure 2: Type-I error as a function of $t_0$. The confidence bands are 95% pointwise Wald-type confidence sequences. We obtain each point in the two plots by simulating trajectories of $(S_t)_{t \in \{t_1,\ldots,t_{\max}\}}$. The dashed black line marks the nominal $\alpha$ level, whose value is indicated on the right side of the plots.
  • Figure 3: Relative efficiency, defined as $\mathrm{median}(\tau) / \mathrm{median}(\tau^{\mathrm{svs}})$. We obtain each point by simulating 8000 trajectories of $(S_t)_{t \in \{t_1,\ldots,t_{\max}\}}$, for i.i.d. observations $O_1, O_2, \ldots \sim \mu + (\mathrm{Bernoulli}(0.03) - 0.03)$.
  • Figure 4: Median, first, and third quartiles of the ratio of the rejection time to the asymptotic upper bound $2 \mu^{-2} \log \mu^{-1}$ from the rejection-time upper-bound theorem, for the burn-in oracle simple-vs-simple SPRT, burn-in nmSPRT, burn-in rmlSPRT, and burn-in nmSPRT with adaptive $\lambda$. We obtain each point by simulating 8000 trajectories of $(S_t)_{t \in \{t_1,\ldots,t_{\max}\}}$, for i.i.d. observations $O_1, O_2, \ldots \sim \mu + (\mathrm{Bernoulli}(0.03) - 0.03)$. We set $\alpha = 5 \cdot 10^{-2}$ for the current plot. We set the burn-in period $t_0$ to 200. The panel labels at the top represent the value of $\alpha$.
  • Figure 5: nmSPRT and rMLE test-statistic trajectories on one arbitrary draw of the data sequence in the play-delay case study.
  • ...and 1 more figure

Theorems & Definitions (66)

  • Example 1: Sample mean
  • Example 2: Bernoulli trial with covariates
  • Lemma 1
  • Theorem 1
  • Lemma 2
  • Theorem 2
  • Theorem 3
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • ...and 56 more