Sequential Kernelized Stein Discrepancy

Diego Martinez-Taboada; Aaditya Ramdas

TL;DR

Sequential Kernelized Stein Discrepancy introduces a sequential, anytime-valid goodness-of-fit test for unnormalized densities by combining the kernelized Stein discrepancy (KSD) with a betting-based martingale framework. It removes the common requirement that the Stein kernel be uniformly bounded: instead, it exploits pointwise bounds and a normalization $M_p$ to construct nonnegative wealth processes, which enables continuous monitoring and adaptive stopping. The paper proves validity under the null and establishes exponential wealth growth under alternatives, with empirical demonstrations on Gaussian, intractable, and Gaussian-Bernoulli RBM models. This approach extends KSD-based goodness-of-fit testing to complex energy-based models and MCMC diagnostics, offering resource-efficient, flexible hypothesis testing in practice.

Abstract

We present a sequential version of the kernelized Stein discrepancy goodness-of-fit test, which allows for conducting goodness-of-fit tests for unnormalized densities that are continuously monitored and adaptively stopped. That is, the sample size need not be fixed prior to data collection; the practitioner can choose whether to stop the test or continue to gather evidence at any time while controlling the false discovery rate. In stark contrast to related literature, we do not impose uniform boundedness on the Stein kernel. Instead, we exploit the potential boundedness of the Stein kernel at arbitrary point evaluations to define test martingales, which give rise to novel sequential tests. We prove the validity of the test, as well as an asymptotic lower bound for the logarithmic growth of the wealth process under the alternative. We further illustrate the empirical performance of the test with a variety of distributions, including restricted Boltzmann machines.
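To make the role of unnormalized densities concrete, here is a minimal sketch of a Langevin Stein kernel $h_p(x, y)$, which depends on $p$ only through its score $\nabla \log p$. The RBF base kernel, the function name `stein_kernel_rbf`, and its parameters are assumptions chosen for illustration; the paper's own kernel and its normalization $M_p$ are not reproduced here.

```python
import numpy as np


def stein_kernel_rbf(x, y, score, lengthscale=1.0):
    """Langevin Stein kernel h_p(x, y) for an RBF base kernel (illustrative).

    x, y  : arrays of shape (n, d), or (d,) for a single point; rows are paired.
    score : callable returning grad log p row by row. The normalizing
            constant of p cancels in the score, so it is never needed.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    d = x.shape[1]
    diff = x - y
    sq_dist = np.sum(diff ** 2, axis=1)
    l2 = lengthscale ** 2
    k = np.exp(-sq_dist / (2.0 * l2))          # k(x, y)
    sx, sy = score(x), score(y)

    term_ss = np.sum(sx * sy, axis=1) * k      # s(x)^T s(y) k(x, y)
    term_sx = np.sum(sx * diff, axis=1) * k / l2   # s(x)^T grad_y k(x, y)
    term_sy = -np.sum(sy * diff, axis=1) * k / l2  # s(y)^T grad_x k(x, y)
    term_tr = (d / l2 - sq_dist / l2 ** 2) * k     # trace of grad_x grad_y k
    return term_ss + term_sx + term_sy + term_tr


# Assumed toy null: standard Gaussian, whose score is -z.
gaussian_score = lambda z: -z
rng = np.random.default_rng(0)
n = 50_000

# Independent pairs drawn from the null: the average should be near zero.
null_mean = stein_kernel_rbf(
    rng.normal(size=(n, 1)), rng.normal(size=(n, 1)), gaussian_score
).mean()

# Independent pairs drawn from a mean-shifted alternative: clearly positive.
alt_mean = stein_kernel_rbf(
    rng.normal(1.0, 1.0, size=(n, 1)), rng.normal(1.0, 1.0, size=(n, 1)), gaussian_score
).mean()

print(null_mean, alt_mean)
```

In this toy setting the kernel value grows with $\|x\|$ along the diagonal $x = y$, so it is not uniformly bounded, which is the situation the pointwise bounds are meant to handle.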

Paper Structure (26 sections, 7 theorems, 76 equations, 10 figures, 1 algorithm)

INTRODUCTION
RELATED WORK
BACKGROUND
The Kernelized Stein Discrepancy
Testing by Betting
SEQUENTIAL GOODNESS-OF-FIT BY BETTING
Simple Null Hypothesis
Composite Null Hypothesis
DERIVATION OF SENSIBLE BOUNDS
A General Approach
Specific Examples of Bound Derivations
EXPERIMENTS
CONCLUSION
ADDITIONAL EXPERIMENTS
Logarithmic wealth process of a Gaussian-Bernoulli restricted Boltzmann machine
...and 11 more sections

Key Result

Theorem 1

Assume that $\mathbb{E}_{H_0}[h_p(X, X')]=0$, and let $\lambda_t \in [0, 1]$ be predictable. Then the wealth process, with $g_t$ defined as in the paper (eq:gt_definition), is a test martingale, and the associated stopping time defines a level-$\alpha$ sequential test.
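The mechanics behind this kind of result can be sketched with a generic betting wealth process. The sketch assumes the standard testing-by-betting update $K_t = K_{t-1}(1 + \lambda_t g_t(X_t))$ with $K_0 = 1$ and rejection once $K_t \geq 1/\alpha$; by Ville's inequality, a nonnegative martingale started at 1 reaches $1/\alpha$ with probability at most $\alpha$. The names `sequential_betting_test`, `payoff`, and `lam` are hypothetical, the constant bet stands in for an adaptive predictable strategy, and the `tanh` payoff is a toy stand-in, not the paper's $g_t$.

```python
import numpy as np


def sequential_betting_test(xs, payoff, alpha=0.05, lam=0.5):
    """Run a betting-style sequential test over a stream of observations.

    xs     : iterable of observations, processed one at a time.
    payoff : callable with values in [-1, 1] and mean zero under the null
             (a stand-in for the paper's g_t; assumed, not reproduced).
    lam    : constant bet in [0, 1]; a constant is trivially predictable.
    Returns (stopping_time or None, list of wealth values).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    wealth = 1.0
    history = []
    for t, x in enumerate(xs, start=1):
        g = float(payoff(x))
        if not -1.0 <= g <= 1.0:
            raise ValueError("payoff must take values in [-1, 1]")
        wealth *= 1.0 + lam * g       # each factor is >= 0, so wealth stays nonnegative
        history.append(wealth)
        if wealth >= 1.0 / alpha:     # threshold justified by Ville's inequality
            return t, history
    return None, history


# Toy null: data symmetric around zero, so the odd, bounded payoff tanh(x)
# has mean zero. Under a mean-shifted alternative the wealth tends to grow.
rng = np.random.default_rng(1)
stop_alt, wealth_alt = sequential_betting_test(rng.normal(1.0, 1.0, size=2000), np.tanh)
print("stopped at step:", stop_alt)
```

Because the wealth is checked after every observation, the practitioner can stop as soon as the threshold is crossed or keep sampling, without fixing the sample size in advance.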

Figures (10)

Figure 1: Average logarithmic wealth alongside $95\%$ empirical confidence intervals for $1000$ simulations under the alternative for (I) the Gaussian distribution, (II) the intractable model. We highlight the exponential growth of the wealth.
Figure 2: Proportion of rejections for the Gaussian distribution considered in the Experiments section under the alternatives $\theta_1 \in \{ 0.40, 0.42, 0.44, 0.46, 0.48, 0.50 \}$ for (I) the (classical) batch setting kernelized Stein discrepancy with sample size $n$, (II) the (proposed) sequential kernelized Stein discrepancy. The (proposed) sequential test always ends up rejecting the null hypotheses, while the batch test will not do so if the original sample size is too small.
Figure 3: Average logarithmic wealth alongside $95\%$ empirical confidence intervals for $1000$ simulations under the alternative for (I) the restricted Boltzmann machine with shifted $B$, (II) the restricted Boltzmann machine with bias $b=1$. We highlight once again the exponential growth of the wealth processes.
Figure 4: Average wealth alongside $95\%$ empirical confidence intervals for $1000$ simulations under the null for (I) the Gaussian distribution, (II) the intractable model, (III) the restricted Boltzmann machine. We emphasize that the wealth processes do not cross the threshold $1/0.05$, and hence the nulls are not rejected, showing the empirical type-I error control.
Figure 5: Proportion of rejections for $500$ simulations for the Gaussian distribution considered in the Experiments section under three different alternatives with (I) $\theta = 0.5$, (II) $\theta = 0.75$, (III) $\theta = 1$. We emphasize that the (proposed) sequential test always ends up rejecting the null hypotheses, while the batch test will not do so if the original sample size is too small.
...and 5 more figures

Theorems & Definitions (14)

Theorem 1: Validity under null.
Definition 1: aGRAPA strategy
Definition 2: LBOW strategy
Theorem 2: E-power under alternative
Theorem 3
Theorem 4: SLLN for Banach-valued random variables (Bosq, 2000).
Theorem 5: Ville's inequality.
Proposition 1
proof
Proposition 2
...and 4 more
