Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy

Omar Hagrass; Bharath Sriperumbudur; Krishnakumar Balasubramanian

TL;DR

This paper addresses the problem of establishing minimax-optimal goodness-of-fit tests on general domains by leveraging kernel Stein discrepancy (KSD). It introduces an operator-theoretic representation of $D_{\mathrm{KSD}}$ and proposes a spectral-regularized discrepancy $D^2_{\lambda}$, along with a practical estimator $\hat{\mathbb S}_{\lambda}^{P}$ that uses data without requiring full knowledge of the null or extra null samples. The authors prove minimax optimal separation radii $\Delta_n$ under both polynomial and exponential eigenvalue decays, show that unregularized KSD is suboptimal, and provide an adaptive union test across $\lambda$ that achieves minimax optimality up to a $\log\log$ factor. Extensive experiments on Euclidean, spherical, and infinite-dimensional domains demonstrate the empirical superiority of the regularized KSD tests over unregularized variants, while remaining computationally efficient via a U-statistic framework and wild bootstrap thresholds. The work thus offers robust, domain-agnostic goodness-of-fit testing with strong theoretical guarantees and practical adaptability.

Abstract

We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD). The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions, accommodating diverse data structures beyond Euclidean spaces, and relying only on partial knowledge of the reference distribution, while maintaining computational efficiency. Although KSD is a powerful framework for goodness-of-fit testing, only the consistency of the corresponding tests has been established so far, and their statistical optimality remains largely unexplored. In this paper, we develop a general framework and an operator-theoretic representation of the KSD, encompassing many existing KSD tests in the literature, which vary depending on the domain. Building on this representation, we propose a modified discrepancy by applying the concept of spectral regularization to the KSD framework. We establish the minimax optimality of the proposed regularized test for a wide range of the smoothness parameter $\theta$ under a specific alternative space, defined over general domains, using the $\chi^2$-divergence as the separation metric. In contrast, we demonstrate that the unregularized KSD test fails to achieve the minimax separation rate for the considered alternative space. Additionally, we introduce an adaptive test capable of achieving minimax optimality up to a logarithmic factor by adapting to unknown parameters. Through numerical experiments, we illustrate the superior performance of our proposed tests across various domains compared to their unregularized counterparts.
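As a point of reference for the unregularized baseline mentioned above, the following is a minimal sketch of a standard Langevin KSD test on $\mathbb{R}^d$ with a Gaussian kernel, a U-statistic estimate, and a Rademacher wild bootstrap threshold. It is not the regularized estimator proposed in the paper. The function names `stein_kernel_matrix` and `ksd_wild_bootstrap_test`, the fixed bandwidth, and the number of bootstrap draws are all illustrative assumptions. The only knowledge of the null it needs is the score function $\nabla \log p_0$, which is the sense in which KSD relies on partial knowledge of the reference distribution.

```python
import numpy as np


def stein_kernel_matrix(X, score, bandwidth=1.0):
    """Langevin Stein kernel h_p(x_i, x_j) for a Gaussian base kernel.

    Assumes X has shape (n, d) and score(X) returns grad log p0 at each row.
    """
    n, d = X.shape
    S = score(X)                                  # (n, d) score values
    diff = X[:, None, :] - X[None, :, :]          # diff[i, j] = x_i - x_j
    sqdist = np.sum(diff ** 2, axis=-1)
    h2 = bandwidth ** 2
    K = np.exp(-sqdist / (2.0 * h2))              # Gaussian base kernel

    ss = S @ S.T                                  # s(x_i) . s(x_j)
    s_x_d = np.einsum("id,ijd->ij", S, diff)      # s(x_i) . (x_i - x_j)
    s_y_d = np.einsum("jd,ijd->ij", S, diff)      # s(x_j) . (x_i - x_j)
    trace_term = d / h2 - sqdist / h2 ** 2        # tr(grad_x grad_y k) / k
    return K * (ss + (s_x_d - s_y_d) / h2 + trace_term)


def ksd_wild_bootstrap_test(X, score, bandwidth=1.0, alpha=0.05,
                            n_boot=500, seed=0):
    """U-statistic KSD estimate with a Rademacher wild bootstrap threshold."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    H = stein_kernel_matrix(X, score, bandwidth)
    np.fill_diagonal(H, 0.0)                      # U-statistic drops i == j
    stat = H.sum() / (n * (n - 1))

    # Each bootstrap draw reweights the pair (i, j) by eps_i * eps_j.
    eps = rng.choice([-1.0, 1.0], size=(n_boot, n))
    boot = np.sum((eps @ H) * eps, axis=1) / (n * (n - 1))
    threshold = np.quantile(boot, 1.0 - alpha)
    return stat, threshold, bool(stat > threshold)
```

For a standard normal null the score is `lambda x: -x`, so a call looks like `ksd_wild_bootstrap_test(X, lambda x: -x)`. A fixed bandwidth keeps the sketch short; in practice it is often chosen by a heuristic such as the median pairwise distance.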

Paper Structure (25 sections, 19 theorems, 214 equations, 8 figures)

Introduction
Minimax framework
Contributions
Definitions & notation
KSD-based tests via spectral regularization
Adaptation to $\lambda$ by aggregation
Interpreting the class of alternatives, $\mathcal{P}$
Experiments
Gaussian-Bernoulli restricted Boltzmann machine
Directional data
Brownian motion
Discussion
Proofs
Proof of Proposition \ref{thm: computation}
Proof of Theorem \ref{Type-I error}
...and 10 more sections

Key Result

Proposition 1

Define the inclusion operator $\mathfrak J : \mathscr H_{K_0} \to L^{2}(P_0)$, $f \mapsto [f]_{\sim}$. Then under assump:a0, where $u:=\frac{dP}{dP_0}-1$, $\mathfrak J^*$ is the adjoint of $\mathfrak J$, defined as $\mathfrak J^* : L^{2}(P_0) \to \mathscr H_{K_0}, \ f \mapsto \int K_0(\cdot,x)f(x)\,dP_0(x)$, $\Upsilon_{P_0}:=\mathfrak J\mathfrak J^*$ is the integral operator with $(\lambda_i,\tild

Figures (8)

Figure 1: Power of the tests for Gaussian-Bernoulli restricted Boltzmann machine with $d=50$, $n_2=100$ and $n=1000$.
Figure 2: Power for mixture of Watson distributions for different concentration parameter $k$ using $n_2=100$ and $n=500$.
Figure 3: Power for different values for $\delta$ using $n_2=35$ and $n=70$.
Figure 4: Power for different values of $k$ using $n_2=100$ and $n=200$.
Figure 5: Power of the KSD(Tikhonov) test with varying choices of $n_2$ for Gaussian-Bernoulli restricted Boltzmann machine with $d=50$ and a total sample size of $n=1000$.
...and 3 more figures

Theorems & Definitions (49)

Example 1: KSD on $\mathcal{X}=\mathbb{R}^d$
Example 2: KSD on a Riemannian manifold
Example 3: KSD on Hilbert spaces
Proposition 1
proof
Remark 1
Proposition 2
Remark 2
Theorem 3
Remark 3
...and 39 more
