Equivalence testing with data-dependent and post-hoc equivalence margins

Stan Koobs, Nick W. Koning

Abstract

Equivalence testing compares the hypothesis that an effect $μ$ is large against the alternative that it is negligible. Here, 'large' is classically expressed as being larger than some 'equivalence margin' $Δ$. A longstanding problem is that this margin must be specified but can rarely be objectively justified in practice. We lay the foundation for an alternative paradigm, arguing instead for reporting a data-dependent margin $\widehat{Δ}_α$ that bounds the true effect $μ$ with probability $1 - α$. Our key argument is that $\widehat{Δ}_α$ is more useful than a test outcome at a fixed margin $Δ$, as measured by the guarantees it offers to decision makers. We generalize this to a curve of margins $α \mapsto \widehat{Δ}_α$ that is uniformly valid under post-hoc selection of the margin. These ideas rely on e-values, which we derive for models that are strictly totally positive of order 3, nesting the classical z-test and t-test settings.

Paper Structure

This paper contains 61 sections, 13 theorems, 76 equations, and 5 figures.

Key Result

Proposition 1

A data-dependent margin $\widehat{\Delta}_\alpha$ corresponds to a non-decreasing right-continuous curve $\Delta \mapsto \phi_\Delta^\alpha$ of tests. $\widehat{\Delta}_\alpha$ is valid if and only if $\phi^{\alpha}_\Delta$ is valid for $H_0^\Delta$ for every $\Delta \geq 0$.
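The correspondence in Proposition 1 can be illustrated with a minimal sketch. It uses the classical z-based two one-sided tests (TOST) procedure in the Gaussian location model with known $\sigma$, which is a stand-in chosen for illustration and not the e-value construction derived in the paper. Under that assumption, the smallest margin at which TOST rejects is $|\bar{X}| + z_{1-\alpha}\,\sigma/\sqrt{n}$, and the induced curve of tests is taken here to be $\phi_\Delta^\alpha = 1\{\widehat{\Delta}_\alpha \leq \Delta\}$, which is one natural reading of the proposition. The function names `tost_margin` and `tost_test` are hypothetical, and the inputs are the values quoted for the realization in Figure 2.

```python
from math import sqrt
from statistics import NormalDist


def tost_margin(xbar: float, n: int, sigma: float, alpha: float) -> float:
    """Data-dependent margin induced by a z-based TOST (illustrative assumption).

    Returns |xbar| + z_{1-alpha} * sigma / sqrt(n), the infimum of margins
    at which the level-alpha TOST rejects "the effect is large".
    """
    z = NormalDist().inv_cdf(1 - alpha)  # standard normal quantile
    return abs(xbar) + z * sigma / sqrt(n)


def tost_test(xbar: float, n: int, sigma: float, alpha: float, delta: float) -> bool:
    """Test at a fixed margin delta, written as phi_delta = 1{margin <= delta}."""
    return tost_margin(xbar, n, sigma, alpha) <= delta


# Inputs quoted for the realization in Figure 2: xbar = 0.05, n = 40, sigma = 1.
margin = tost_margin(xbar=0.05, n=40, sigma=1.0, alpha=0.05)
print(f"data-dependent margin at alpha = 0.05: {margin:.4f}")

# The curve delta -> phi_delta is non-decreasing: once it rejects, it keeps rejecting.
for delta in (0.1, 0.3, 0.5):
    print(delta, tost_test(0.05, 40, 1.0, 0.05, delta))
```

Reporting `margin` conveys the outcome of the fixed-margin test at every $\Delta$ simultaneously: the test at margin $\Delta$ rejects exactly when $\Delta$ is at least the reported value.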

Figures (5)

  • Figure 1: The left panel shows two uniformly valid equivalence curves $\alpha \mapsto \widehat{\Delta}_\alpha$, where the dashed line corresponds to a fixed-$\alpha$ data-dependent margin. The right panel re-expresses these as curves of e-values $\Delta \mapsto \varepsilon_\Delta$.
  • Figure 2: Comparison of four procedures to assess equivalence on the mean $\mu$ in the Gaussian location model. Left: positive half of equivalence curves $\alpha \mapsto \widehat{\Delta}_\alpha$. Right: curves of e-values $\Delta \mapsto \varepsilon_\Delta$. The setup for the displayed realization is: $\bar{X}=0.05$, $n=40$ and $\sigma=1$.
  • Figure 3: Visualization of the STP$_2$ and STP$_3$ regions in the three-dimensional probability simplex for the example in Section \ref{sec:tp2tp3}. The gray triangle is the simplex of probability vectors. The yellow region contains points $p$ for which $(p_{\mu_1},p_{\mu_2},p)$ is STP$_2$ but not STP$_3$, while the green region contains points $p$ for which $(p_{\mu_1},p_{\mu_2},p)$ is STP$_3$ (and hence also STP$_2$).
  • Figure 4: Comparison of sequential TOST-E and symmetric $t$-squared test under the alternative $\delta=0$, with $X_1,\ldots,X_n\sim\mathcal{N}(0,1)$, sample sizes $2\le n\le 50$, and $M=50{,}000$ replications.
  • Figure 5: Comparison of sequential TOST-E and product of numeraires for asymmetric margins $(\Delta^-,\Delta^+)=(-0.6,0.4)$ under $X_1,\ldots,X_n\sim\mathcal{N}(\mu,1)$, with $2\le n\le 75$ and $M=200{,}000$ replications.

Theorems & Definitions (46)

  • Definition 1: Test
  • Definition 2: E-value
  • Definition 3: Valid data-dependent margin
  • Remark 1
  • Remark 2
  • Proposition 1
  • Remark 3: Non-decreasing in $\Delta$
  • Definition 4
  • Theorem 1
  • Remark 4
  • ...and 36 more