Spectral Regularized Kernel Two-Sample Tests

Omar Hagrass; Bharath K. Sriperumbudur; Bing Li

TL;DR

This work analyzes kernel-based two-sample testing on general domains through RKHS embeddings and reveals the non-optimality of the standard MMD statistic with respect to minimax separation over a locally smooth alternative class. It introduces a spectral regularized MMD statistic η_λ that incorporates covariance information via a spectral regularizer g_λ, achieving minimax optimal separation rates that depend on the eigen-decay of the underlying covariance operator and the smoothness θ. The authors further provide a practical, data-driven permutation test and an adaptive version that aggregates over λ (and kernels) to attain near-minimax optimality without prior knowledge of θ or β, along with theoretical guarantees and computation trade-offs. Extensive experiments on Gaussian, Cauchy, MNIST, and directional data demonstrate the proposed method’s robustness and improved power over MMD, Energy, and KS tests, especially in challenging, high-dimensional or non-Euclidean settings.

Abstract

Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show the popular MMD (maximum mean discrepancy) two-sample test to be not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification to the MMD test based on spectral regularization by taking into account the covariance information (which is not captured by the MMD test) and prove the proposed test to be minimax optimal with a smaller separation boundary than that achieved by the MMD test. Third, we propose an adaptive version of the above test which involves a data-driven strategy to choose the regularization parameter and show the adaptive test to be almost minimax optimal up to a logarithmic factor. Moreover, our results hold for the permutation variant of the test where the test threshold is chosen elegantly through the permutation of the samples. Through numerical experiments on synthetic and real data, we demonstrate the superior performance of the proposed test in comparison to the MMD test and other popular tests in the literature.
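For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a plain MMD permutation test, not the authors' spectral regularized statistic. The Gaussian kernel, the fixed `bandwidth`, the unbiased U-statistic form of the estimator, and all function names are assumptions made for this sketch.

```python
import numpy as np

def gaussian_gram(Z, bandwidth):
    """Gram matrix of a Gaussian kernel on the rows of Z (kernel choice is an assumption)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def mmd2_unbiased(K, idx_x, idx_y):
    """Unbiased estimate of squared MMD from a pooled Gram matrix K."""
    Kxx = K[np.ix_(idx_x, idx_x)]
    Kyy = K[np.ix_(idx_y, idx_y)]
    Kxy = K[np.ix_(idx_x, idx_y)]
    n, m = len(idx_x), len(idx_y)
    # Drop the diagonal terms k(x_i, x_i) from the within-sample averages.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * Kxy.mean()

def mmd_permutation_test(X, Y, bandwidth=1.0, n_perm=200, alpha=0.05, seed=0):
    """Permutation test: the threshold comes from re-splitting the pooled sample."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n, total = len(X), len(X) + len(Y)
    K = gaussian_gram(Z, bandwidth)  # computed once, reused for every permutation
    observed = mmd2_unbiased(K, np.arange(n), np.arange(n, total))
    null_stats = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(total)
        null_stats[b] = mmd2_unbiased(K, perm[:n], perm[n:])
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_perm)
    return float(observed), float(p_value), bool(p_value <= alpha)
```

Calling `mmd_permutation_test(X, Y)` on two arrays of shape `(n, d)` and `(m, d)` returns the observed statistic, a permutation p-value, and the reject decision at level `alpha`. The regularized test described in the abstract additionally uses covariance information, which this sketch does not attempt to reproduce.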

Paper Structure (36 sections, 37 theorems, 311 equations, 18 figures)

Introduction
Definitions & Notation
Non-optimality of $D^2_{\text{MMD}}$ test
Spectral regularized MMD test
Test statistic
Oracle test
Permutation test
Adaptation
Choice of kernel
Experiments
Benchmark datasets
Gaussian distribution
Cauchy distribution
MNIST dataset
Directional data
...and 21 more sections

Key Result

Theorem 3.1

Suppose $(A_0)$ holds. Let $N\geq 2$, $M\geq 2$, $M \leq N \leq DM$ for some constant $D>1$, $k \in \{1,2\}$, and … Then for any $\alpha>0$, $\delta>0$, $P_{H_0}\{\hat{D}_{\mathrm{MMD}}^2 \geq \gamma_k\} \leq \alpha,$ where $\gamma_1 = \frac{2\sqrt{6}\kappa}{\sqrt{\alpha}}\left(\frac{1}{N}+\frac{1}{M}\right)$, $\gamma_2 = q_{1-\alpha},$ $c_1(\alpha,\delta)\asymp\max\{\alpha^{-1/2},\delta^{-1}\}$ an
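The closed-form threshold $\gamma_1$ in the statement above can be evaluated directly. The short sketch below does only that arithmetic; the function name `gamma_1` is hypothetical, and $\kappa$ is treated as a user-supplied constant because this excerpt does not define it (presumably it comes from assumption $(A_0)$).

```python
import math

def gamma_1(kappa, alpha, N, M):
    """Evaluate gamma_1 = (2*sqrt(6)*kappa / sqrt(alpha)) * (1/N + 1/M).

    kappa: constant from the theorem (its definition is not given in this excerpt).
    alpha: significance level, must be positive.
    N, M: the two sample sizes, each at least 2 as in the theorem.
    """
    if alpha <= 0 or N < 2 or M < 2:
        raise ValueError("need alpha > 0 and N, M >= 2")
    return 2.0 * math.sqrt(6.0) * kappa / math.sqrt(alpha) * (1.0 / N + 1.0 / M)

# Example values are arbitrary and chosen only to show the scaling in N and M.
threshold_small = gamma_1(kappa=1.0, alpha=0.05, N=100, M=100)
threshold_large = gamma_1(kappa=1.0, alpha=0.05, N=1000, M=1000)
```

The threshold shrinks at rate $1/N + 1/M$ and grows like $\alpha^{-1/2}$ as the level tightens, which is why the statement also offers the quantile-based choice $\gamma_2 = q_{1-\alpha}$.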

Figures (18)

Figure 1: Type-I error for different number of permutations.
Figure 2:
Figure 5:
Figure 8:
Figure 11:
...and 13 more figures

Theorems & Definitions (69)

Remark 3.1
Theorem 3.1: Separation boundary of MMD test
Remark 3.2
Theorem 3.2: Minimax separation boundary
Corollary 3.3: Minimax separation boundary-Polynomial decay
Corollary 3.4: Minimax separation boundary-Exponential decay
Remark 3.3
Remark 4.1
Remark 4.2
Theorem 4.1
...and 59 more
