Kernel Two-Sample Tests for Manifold Data
Xiuyuan Cheng, Yao Xie
TL;DR
This paper develops non-asymptotic theory for kernel-based two-sample tests applied to data lying on or near low-dimensional manifolds, showing that the test power depends on the intrinsic dimension $d$, the Hölder smoothness $\beta$ of the densities, and the squared $L^2$-divergence $\Delta_2$ between them. A key result is that with bandwidth $\gamma$ scaled as $n^{-1/(d+4\beta)}$, density departures are detectable once $\Delta_2 \gtrsim n^{-2\beta/(d+4\beta)}$; because this rate depends on $d$ rather than on the ambient dimension, the test avoids the curse of dimensionality on manifolds. The theory remains valid for non-PSD kernels. The authors extend the framework to manifolds with boundary and to data corrupted by additive noise, showing through a near-boundary belt argument and Gaussian noise bounds that the same finite-sample guarantees hold under reasonable conditions. They validate the theory through numerical experiments on synthetic manifold data and the MNIST dataset, demonstrating that bandwidths smaller than the median distance can improve power when the intrinsic dimension is low and samples are plentiful, and that non-PSD kernels can still be effective. The work suggests a broader class of kernel-based tests for manifold data and informs bandwidth selection strategies that exploit intrinsic structure in high-dimensional settings.
Abstract
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, when data densities $p$ and $q$ are supported on a $d$-dimensional sub-manifold $\mathcal{M}$ embedded in an $m$-dimensional space and are Hölder with order $\beta$ (up to 2) on $\mathcal{M}$, we prove a guarantee of the test power for any finite sample size $n$ exceeding a threshold that depends on $d$, $\beta$, and $\Delta_2$, the squared $L^2$-divergence between $p$ and $q$ on the manifold, provided the kernel bandwidth $\gamma$ is properly chosen. For small density departures, we show that with large $n$ they can be detected by the kernel test when $\Delta_2$ is greater than $n^{-2\beta/(d+4\beta)}$ up to a certain constant and $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test has no curse-of-dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
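To make the setting concrete, the following is a minimal sketch of a kernel two-sample test on manifold data. It is not the authors' implementation: it assumes the standard biased MMD$^2$ estimator with a Gaussian kernel, calibrates the threshold by permutation, and sets the bandwidth to $\gamma = c\, n^{-1/(d+4\beta)}$ with an arbitrary constant $c = 1$. The function names (`gaussian_kernel`, `mmd2_biased`, `bandwidth_from_rate`, `permutation_test`, `sample_circle`) and the toy data (a unit circle, $d = 1$, embedded in $\mathbb{R}^{20}$, with von Mises versus uniform angular densities) are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian kernel matrix with entries exp(-||a - b||^2 / (2 * gamma^2))."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * gamma**2))


def mmd2_biased(X, Y, gamma):
    """Biased (V-statistic) estimate of the squared MMD between samples X and Y."""
    return (gaussian_kernel(X, X, gamma).mean()
            + gaussian_kernel(Y, Y, gamma).mean()
            - 2.0 * gaussian_kernel(X, Y, gamma).mean())


def bandwidth_from_rate(n, d, beta, c=1.0):
    """Bandwidth gamma = c * n^(-1 / (d + 4 * beta)); the constant c is an assumption."""
    return c * n ** (-1.0 / (d + 4.0 * beta))


def permutation_test(X, Y, gamma, n_perm=200, seed=0):
    """Return (observed statistic, permutation p-value) for the biased MMD^2."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    K = gaussian_kernel(Z, Z, gamma)  # computed once, reused for every permutation
    n = len(X)

    def stat(idx):
        a, b = idx[:n], idx[n:]
        return (K[np.ix_(a, a)].mean()
                + K[np.ix_(b, b)].mean()
                - 2.0 * K[np.ix_(a, b)].mean())

    observed = stat(np.arange(len(Z)))
    null = np.array([stat(rng.permutation(len(Z))) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


def sample_circle(n, m, kappa, rng):
    """Sample n points on a unit circle in R^m; angles are von Mises(0, kappa).

    kappa = 0 gives the uniform density on the circle.
    """
    theta = rng.vonmises(0.0, kappa, size=n)
    X = np.zeros((n, m))
    X[:, 0] = np.cos(theta)
    X[:, 1] = np.sin(theta)
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, m, d, beta = 300, 20, 1, 2.0
    X = sample_circle(n, m, kappa=0.0, rng=rng)  # uniform density p
    Y = sample_circle(n, m, kappa=1.0, rng=rng)  # non-uniform density q
    gamma = bandwidth_from_rate(n, d, beta)
    stat, p_value = permutation_test(X, Y, gamma)
    print(f"gamma={gamma:.3f}  MMD^2={stat:.4f}  p-value={p_value:.4f}")
```

In this sketch the bandwidth is driven by the intrinsic dimension $d = 1$ rather than the ambient dimension $m = 20$, which is the scaling the abstract describes. In practice the constant $c$ would need tuning, and a bandwidth chosen this way may differ substantially from the median-distance heuristic.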
