
A distribution-free valid p-value for finite samples of bounded random variables

Joaquin Alvarez

TL;DR

The paper develops a distribution-free, valid (super-uniform) p-value for testing whether the mean loss $R=\mathbb{E}[L_i]$ exceeds a threshold $\alpha$, using only the boundedness $L_i\in[0,1]$. It builds on a Bernstein-Hoeffding-type concentration bound from Pelekis, Ramon and Wang (PRW) to define a function $g(t;R)$ and a derived quantity $\gamma(R)$, which together yield the PRW valid p-value $p_{\text{PRW}} = g(\min\{\hat{R}, (\gamma(\alpha)-1)/n\}; \alpha)$. The main result shows that this p-value is valid for testing $H_0: R>\alpha$ vs $H_1: R\le\alpha$; the paper also establishes monotonicity properties of $g$ and a well-defined inverse mapping $g^{-1}$ to facilitate calibration. The work compares the new p-value with the Bentkus and Hoeffding-based valid p-values and highlights its applicability to FWER-controlling procedures and to distribution-free uncertainty quantification in predictive inference.
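
For reference, "valid" (super-uniform) is used here in its standard sense, stated below in notation that is assumed for illustration (the symbols $p$, $u$ and $\delta$ do not come from the summary itself): a p-value $p$ is valid if, under the null hypothesis,

$$\mathbb{P}_{H_0}(p\le u)\le u \quad \text{for all } u\in[0,1].$$

Under this condition, rejecting $H_0$ whenever $p\le\delta$ keeps the type I error at most $\delta$ for any fixed sample size $n$.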

Abstract

We build a valid p-value based on a concentration inequality for bounded random variables introduced by Pelekis, Ramon and Wang. The motivation behind this work is the calibration of predictive algorithms in a distribution-free setting. The super-uniform p-value is tighter than the Hoeffding and Bentkus alternatives in certain regions. Even though we are motivated by a calibration setting in a machine learning context, the ideas presented in this work are also relevant in classical statistical inference. Furthermore, we compare the power of a collection of valid p-values for bounded losses presented in previous literature.

Paper Structure (8 sections, 5 theorems, 23 equations, 2 figures, 1 table)

Key Result

Theorem 2.1

Let $X_1,\dots,X_n$ be independent and identically distributed (i.i.d.) random variables such that $0\leq X_i\leq 1$ and $p\coloneqq \mathbb{E}[X_i]$. Let $n\in \mathbb{N}$, $p\in(0,1)$ and $Bin(n,p)$ denote a Binomial random variable with parameters $(n,p)$. Then for any positive integer $t$ such t

Figures (2)

  • Figure 1: A graph of $\lceil nt\rceil$ in $[0,1]$, with some hypothetical critical values to make sure that the upper bound is well defined. This figure motivates the definition of $\gamma(R)$. The function $\lceil nt\rceil$ takes positive jumps at $\{0,\frac{1}{n}, \frac{2}{n},\dots, \frac{n-1}{n}\}$.
  • Figure 2: In the setting of the hypothesis test $H_0: R>\alpha$ vs $H_1: R\le\alpha$, we consider the following valid p-values, as presented in Angelopoulos et al. (2022) and Bates et al. (2021): Bentkus' valid p-value, $p_{Bent}\coloneqq e\,\mathbb{P}\{Bin(n,\alpha)\leq \lceil n\hat{R}\rceil\}$, and Hoeffding's valid p-value (tight version), $p_{HT}\coloneqq \exp\{-n(\min\{\hat{R}, \alpha\}\log(\frac{\min\{\hat{R}, \alpha\}}{\alpha})+ (1-\min\{\hat{R}, \alpha\})\log(\frac{1-\min\{\hat{R}, \alpha\}}{1-\alpha}))\}$. We also consider the valid p-value introduced in this work, PRW's valid p-value: $g(\min\{\hat{R}, \frac{\gamma(\alpha)-1}{n}\};\alpha)$.
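
The two baseline p-values in the Figure 2 caption can be evaluated directly from their formulas. The following is a minimal Python sketch, not the paper's code: the function names `binom_cdf`, `kl_bernoulli`, `p_hoeffding` and `p_bentkus` are hypothetical, $\alpha\in(0,1)$ is assumed, and the PRW p-value is left out because $g$ and $\gamma$ are not defined in this summary.

```python
import math


def binom_cdf(k, n, p):
    """P(Bin(n, p) <= k), computed by direct summation of the pmf."""
    k = min(k, n)
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def kl_bernoulli(a, b):
    """a*log(a/b) + (1-a)*log((1-a)/(1-b)), with the convention 0*log(0) = 0."""
    total = 0.0
    if a > 0:
        total += a * math.log(a / b)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - b))
    return total


def p_hoeffding(r_hat, alpha, n):
    """Hoeffding p-value (tight version) as written in the Figure 2 caption."""
    a = min(r_hat, alpha)
    return math.exp(-n * kl_bernoulli(a, alpha))


def p_bentkus(r_hat, alpha, n):
    """Bentkus p-value as written in the caption: e * P(Bin(n, alpha) <= ceil(n * r_hat))."""
    # Rounding before the ceiling is an added safeguard against floating-point
    # noise in n * r_hat; it is not part of the formula.
    k = math.ceil(round(n * r_hat, 9))
    return math.e * binom_cdf(k, n, alpha)
```

As written in the caption, the Bentkus expression is not capped at 1, so `p_bentkus` can return values above 1 when $\hat{R}$ is large. The Hoeffding expression equals 1 whenever $\hat{R}\ge\alpha$, because the minimum then equals $\alpha$ and the exponent vanishes.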

Theorems & Definitions (5)

  • Theorem 2.1
  • Corollary 2.1.1
  • Lemma 2.2
  • Lemma 2.3
  • Theorem 2.4