Targeted Separation and Convergence with Kernel Discrepancies
Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey
TL;DR
The paper derives sufficient and necessary conditions under which kernel-based discrepancies, both MMDs and KSDs, separate a target distribution $P$ from other measures and control weak convergence of sequences to $P$. By introducing Bochner embeddability, score-based separation, and $L^2$-ISPD criteria, it gives verifiable conditions that relax earlier restrictive assumptions and yield the first KSDs known to exactly metrize weak convergence to $P$. It also shows how to construct bounded, convergence-controlling Stein kernels from standard translation-invariant base kernels via Schwarz tilting, which supports convergence guarantees compatible with SVGD. The results bear directly on hypothesis testing, sample quality measurement, and Stein variational methods, and the paper outlines directions for generalizing to other notions of convergence. Overall, the work treats separation and convergence control for kernel discrepancies within a single framework.
Abstract
Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target $P$ from other probability measures or even (ii) control weak convergence to $P$. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to $P$. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.
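To make the notion of an MMD concrete, the following is a minimal sketch of the standard biased (V-statistic) estimate of the squared MMD between two samples. It is an illustration only, not code from the paper: the Gaussian base kernel, the `bandwidth` parameter, and the function names `gaussian_kernel` and `mmd_squared` are all assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 h^2)).

    The Gaussian kernel and the bandwidth h are illustrative assumptions.
    """
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y.

    x has shape (n, d) and y has shape (m, d).
    """
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p_sample = rng.normal(size=(500, 2))            # draws from the target
    q_same = rng.normal(size=(500, 2))              # independent draws, same law
    q_shifted = rng.normal(loc=1.0, size=(500, 2))  # draws from a shifted law
    print("same law:   ", mmd_squared(p_sample, q_same))
    print("shifted law:", mmd_squared(p_sample, q_shifted))
```

With these samples, the estimate for the shifted law should be clearly larger than the estimate for two samples from the same law. This mirrors property (i) informally: a separating kernel assigns zero discrepancy only when the two measures coincide.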
