Post-Hoc Large-Sample Statistical Inference

Ben Chugg; Etienne Gauthier; Michael I. Jordan; Aaditya Ramdas; Ian Waudby-Smith

Abstract

We derive inferential procedures for large sample sizes that remain valid under data-dependent significance levels (so-called "post-hoc valid inference"). Classical statistical tools require that the significance level -- the "type-I error" -- is selected prior to seeing or analyzing any data. This restriction leads to some drawbacks. For instance, if an analyst generates an inconclusive confidence interval, repeating the process with a larger significance level is not an option -- the result is final. Recently, e-values have emerged as the solution to this problem, being both necessary and sufficient tools for performing various forms of post-hoc inference. All such results, however, have thus far been nonasymptotic. As a result, they inherit familiar limitations of nonasymptotic inferential procedures such as requiring strong moment assumptions and being conservative in general. This paper develops a theory of post-hoc inference in the asymptotic setting, yielding asymptotic post-hoc confidence sets and asymptotic post-hoc p-values that make weaker assumptions and are sharper than their nonasymptotic counterparts.
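To make the role of e-values concrete, here is a minimal sketch of a nonasymptotic post-hoc confidence interval. It is not the paper's asymptotic construction: it assumes iid $N(\theta^\star, 1)$ data with known unit variance, and the function names and the tuning parameter `lam` are hypothetical. The interval at level $\alpha$ collects every $\theta$ whose e-value is below $1/\alpha$, so the supremum over $\alpha > 0$ of $\mathbf{1}\{\theta^\star \notin C(\alpha)\}/\alpha$ equals the e-value at $\theta^\star$, and its expectation is at most one however $\alpha$ is chosen.

```python
import math
import random


def gaussian_e_value(xs, theta, lam):
    """Two-sided e-value for H0: mean == theta, assuming N(theta, 1) data.

    Averages the likelihood ratios for the alternatives theta + lam and
    theta - lam, which simplifies to exp(-n * lam^2 / 2) * cosh(lam * s).
    """
    n = len(xs)
    s = sum(x - theta for x in xs)
    return math.exp(-n * lam ** 2 / 2) * math.cosh(lam * s)


def e_confidence_interval(xs, alpha, lam):
    """Set of theta with e-value < 1/alpha, in closed form via arccosh."""
    if not 0 < alpha <= 1:
        raise ValueError("this sketch only handles alpha in (0, 1]")
    n = len(xs)
    mean = sum(xs) / n
    radius = math.acosh(math.exp(n * lam ** 2 / 2) / alpha) / (lam * n)
    return mean - radius, mean + radius


def estimate_post_hoc_risk(n=20, sims=20000, seed=0):
    """Monte Carlo estimate of E[e-value at the true mean].

    This expectation equals the post-hoc risk E[sup_alpha 1{miss}/alpha]
    for the intervals above; under the Gaussian assumption it is exactly 1.
    """
    rng = random.Random(seed)
    lam = 1 / math.sqrt(n)  # illustrative choice that keeps the variance small
    total = 0.0
    for _ in range(sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += gaussian_e_value(xs, 0.0, lam)
    return total / sims


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(0.3, 1.0) for _ in range(50)]
    # The analyst may move from alpha = 0.05 to alpha = 0.2 after seeing the data.
    print(e_confidence_interval(data, 0.05, lam=0.3))
    print(e_confidence_interval(data, 0.20, lam=0.3))
    print(estimate_post_hoc_risk())
```

The intervals are nested in $\alpha$, so enlarging the significance level after an inconclusive result only shrinks the interval. The price is width: these intervals are wider than a Wald interval at the same nominal level.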

Paper Structure (60 sections, 29 theorems, 210 equations, 7 figures, 1 table)

Introduction
Contributions and outline
Related work
Notation and background
Preliminaries
Post-hoc confidence intervals and p-values
Asymptotic post-hoc CIs and p-values
Distribution-uniform asymptotic post-hoc CIs and p-values
The sufficiency and necessity of asymptotic e-values
Constructing Asymptotic Post-Hoc Confidence Intervals
The IWR asymptotic e-variable and APH-CI
Choosing parameters, Option I: Ex ante anchoring
Choosing parameters, Option II: The method of mixtures
Event partitioning and the R-WS asymptotic e-variable
Simulations
...and 45 more sections

Key Result

Proposition 2.6

Let $\Theta$ be a set, thought of as the "parameter space", and let $(\mathcal{H}_n(\alpha))_{\alpha>0}$ be a family of subsets of $\Theta$ that is monotonic and right-continuous in $\alpha$. Then the sequence $(\mathcal{H}_n)_{n \ge 1}$ is a (uniform) aph-ci for $\theta^\star$ if and only if there exists a sequence of random variables $(E_n(\theta))$ for each $\theta\in\Theta$

Figures (7)

Figure 1: Empirical evaluation of the ratio in \ref{eq:g_ratio}. Here $\alpha_0$ ranges from 0.001 to 0.2 and we plot the curves $\alpha_0\mapsto R(\alpha,\alpha_0)$ for select values of $\alpha$. The maximum value $R(\alpha,\alpha_0)$ across all values of $\alpha$ and $\alpha_0$ is 1.184.
Figure 2: The width of four aph-cis compared to the Wald CI for Gaussian data and heavy-tailed data coming from a t-distribution with three degrees of freedom. "IWR" refers to the aph-ci of Theorem \ref{thm:iwr-aphci} with $\lambda$ chosen via ex ante anchoring. "MIX IWR" refers to the aph-ci of Theorem \ref{thm:iwr-mixture-aphci}. We use $\rho=2$ for $\mathcal{H}_n^{\textsc{r-ws}}$. See Appendix \ref{app:sims} for further details.
Figure 3: Widths of some of our aph-cis compared with those of nonasymptotic CIs based on (nonasymptotic) e-variables. For bounded data, which we take to be iid Bernoulli(0.25), we compare to the classical Bernstein CI and also to the state-of-the-art betting-based empirical Bernstein CI of \cite{waudby2024estimating}. For sub-Gaussian observations (generated as $N(0,1)$ random variables) we compare to the standard CI based on the Chernoff method; see the text for more detail. We use $\alpha=0.05$ across all simulations, and we implement $\mathcal{H}_n^{\textsc{iwr}}$ with $\lambda = \sqrt{2\log(2/\alpha)}$.
Figure 4: Asymptotic Type-I error of the proposed aph-cis as a function of their respective tuning parameters. The nominal significance level is fixed at $\alpha=0.05$ (dotted black line). Since the true asymptotic errors are much smaller than the nominal level, reflecting the conservative cost of ensuring post-hoc validity, the y-axis is shown on a logarithmic scale. Left: Error for iwr and reg ($\eta \in \{0.1,0.5\}$) across values of $\lambda$. The r-ws error is plotted at the bottom of the log-scale as its theoretical asymptotic type-I error is exactly zero. Right: Error for mix,iwr across $\kappa$ for varying truncation radii ($R\in\{2,10,20\}$). Note that for large truncation bounds $(R\ge10)$, the distribution closely matches an untruncated Gaussian, making the $R=10$ and $R=20$ curves nearly identical.
Figure 5: Effect of $R$ and $\kappa$ on the widths of aph-cis based on the truncated Gaussian mixture. Observations are drawn iid from a centered Gaussian with variance one. We use $\alpha=0.05$ across all figures.
...and 2 more figures
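
As a point of reference for the Figure 3 comparison, the sketch below contrasts two textbook intervals for the mean of $n$ observations with scale $\sigma$: the asymptotic Wald CI and a Chernoff-style CI for $\sigma$-sub-Gaussian data. It assumes the "standard CI based on the Chernoff method" is the usual half-width $\sigma\sqrt{2\log(2/\alpha)/n}$; the function names are hypothetical, and neither interval is one of the paper's aph-cis.

```python
import math
from statistics import NormalDist


def wald_half_width(n, alpha, sigma=1.0):
    """Half-width of the two-sided Wald CI: z_{1 - alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / math.sqrt(n)


def chernoff_half_width(n, alpha, sigma=1.0):
    """Half-width of the two-sided Chernoff (sub-Gaussian) CI.

    Follows from P(|mean - mu| >= t) <= 2 * exp(-n * t^2 / (2 * sigma^2)).
    """
    return sigma * math.sqrt(2 * math.log(2 / alpha) / n)


def width_ratio(alpha):
    """Chernoff width divided by Wald width; n and sigma cancel."""
    return chernoff_half_width(1, alpha) / wald_half_width(1, alpha)


if __name__ == "__main__":
    for alpha in (0.2, 0.05, 0.01):
        print(alpha, round(width_ratio(alpha), 3))
```

The ratio does not depend on $n$, so under this assumption the nonasymptotic interval stays wider than the Wald interval by a constant factor at every sample size, which is the kind of conservativeness the abstract attributes to nonasymptotic procedures.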

Theorems & Definitions (51)

Definition 2.1: Post-hoc confidence intervals and p-values
Remark 2.2: On the range of $\alpha$ extending beyond 1
Remark 2.3: From post-hoc risk to type-I error for data-independent $\alpha$ values
Definition 2.4: Asymptotic post-hoc confidence intervals and p-values
Definition 2.5: Distribution-uniform aph-cis and aph-pvals
Proposition 2.6
Remark 2.7
Theorem 3.1
Theorem 3.2
Proposition 3.3
...and 41 more
