Targeted Separation and Convergence with Kernel Discrepancies

Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey

TL;DR

The paper derives new sufficient and necessary conditions under which kernel-based discrepancies, both MMDs and KSDs, (i) separate a target distribution $P$ from other measures and (ii) control weak convergence of sequences to $P$. By introducing Bochner embeddability, score-based separation, and $L^2$-ISPD criteria, it gives verifiable conditions that relax earlier restrictive assumptions and yield the first KSDs known to metrize weak convergence to $P$. It also shows how to construct bounded, convergence-controlling Stein kernels from standard translation-invariant base kernels via Schwarz tilting, which supports convergence guarantees compatible with SVGD. The results bear directly on hypothesis testing, sample quality measurement, and Stein variational methods, and the paper outlines directions for extending them to other notions of convergence.

Abstract

Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.
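
For orientation, the two discrepancies named in the abstract can be written in their standard textbook forms. This is a sketch under simplifying assumptions (a scalar kernel $k$ with reproducing kernel Hilbert space $\mathcal{H}_k$, a target $P$ on $\mathbb{R}^d$ with differentiable density $p$, and measures $Q$ for which all integrals exist); the paper's own Definitions 1 and 2 may be stated in greater generality.

$$
\mathrm{MMD}_k(Q, P) = \sup_{\|h\|_{\mathcal{H}_k} \le 1} \left| \int h \, dQ - \int h \, dP \right|,
\qquad
\mathrm{MMD}_k(Q, P)^2 = \iint k(x, y) \, d(Q - P)(x) \, d(Q - P)(y).
$$

With the score $s_p = \nabla \log p$, the Langevin Stein kernel built from $k$ is

$$
k_p(x, y) = \nabla_x \cdot \nabla_y k(x, y) + \nabla_x k(x, y) \cdot s_p(y) + \nabla_y k(x, y) \cdot s_p(x) + k(x, y) \, s_p(x) \cdot s_p(y),
\qquad
\mathrm{KSD}(Q)^2 = \iint k_p(x, y) \, dQ(x) \, dQ(y).
$$

Under suitable regularity conditions $\int k_p(x, \cdot) \, dP(x) = 0$, so the KSD is the MMD with kernel $k_p$ between $Q$ and $P$, and it can be evaluated using only the score of $P$ and samples from $Q$. In this notation, property (i) asks that the discrepancy vanish only when $Q = P$, and property (ii) asks that the discrepancy of a sequence $Q_n$ tending to zero force $Q_n \to P$ weakly.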

Paper Structure

This paper contains 48 sections, 46 theorems, and 126 equations.

Key Result

Proposition 1 (Embeddability conditions)

The following claims hold true.

Theorems & Definitions (62)

  • Definition 1: Maximum mean discrepancy (MMD)
  • Remark 1: Embeddability
  • Proposition 1: Embeddability conditions
  • Remark 2: Sufficient condition for separability
  • Proposition 2: MMD as a double integral
  • Definition 2: Kernel Stein discrepancy (KSD)
  • Remark 3: Relation to prior definitions of KSD
  • Theorem 1: KSD as MMD
  • Remark 4: Scalar kernel KSD
  • Corollary 1: KSD as a double integral
  • ...and 52 more
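
The double-integral forms named in Proposition 2 and Corollary 1 suggest a direct way to estimate both discrepancies from samples: replace each integral by an average over sample points (a V-statistic). The following is a minimal sketch under stated assumptions, not the paper's implementation. The function names, the Gaussian bandwidth, the inverse multiquadric (IMQ) base kernel with parameters `c` and `beta`, and the standard normal target are all illustrative choices, and this particular IMQ Stein kernel is not claimed to be one of the paper's convergence-metrizing constructions.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth=1.0):
    """Gram matrix of the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """V-statistic estimate of MMD^2 between the empirical measures of x and y."""
    k_xx = gaussian_gram(x, x, bandwidth).mean()
    k_yy = gaussian_gram(y, y, bandwidth).mean()
    k_xy = gaussian_gram(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


def imq_stein_gram(x, score, c=1.0, beta=-0.5):
    """Langevin Stein kernel matrix for the base kernel (c^2 + ||x - y||^2)^beta.

    `score` maps an (n, d) array of points to the (n, d) array of target
    scores grad log p evaluated at those points.
    """
    n, d = x.shape
    s = score(x)                              # s[i] = score at x_i
    r = x[:, None, :] - x[None, :, :]         # r[i, j] = x_i - x_j
    sq = np.sum(r**2, axis=-1)                # ||x_i - x_j||^2
    u = c**2 + sq
    k = u**beta
    # Trace of the mixed second derivative: div_x div_y k(x, y).
    div_term = -4.0 * beta * (beta - 1.0) * u ** (beta - 2.0) * sq \
        - 2.0 * beta * d * u ** (beta - 1.0)
    # Cross terms: grad_x k . s(y) + grad_y k . s(x) = 2 beta u^(beta-1) r . (s(y) - s(x)).
    s_diff = s[None, :, :] - s[:, None, :]    # s_diff[i, j] = s(x_j) - s(x_i)
    cross = 2.0 * beta * u ** (beta - 1.0) * np.sum(r * s_diff, axis=-1)
    # Score-score term: k(x, y) s(x) . s(y).
    score_term = k * (s @ s.T)
    return div_term + cross + score_term


def ksd_squared(x, score, c=1.0, beta=-0.5):
    """V-statistic estimate of KSD^2 of the empirical measure of x against the target."""
    return imq_stein_gram(x, score, c, beta).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    on_target = rng.standard_normal((200, 2))   # samples from N(0, I)
    off_target = on_target + 2.0                # same samples, shifted mean

    def standard_normal_score(z):
        return -z                               # grad log density of N(0, I)

    print("MMD^2 on vs off target:", mmd_squared(on_target, off_target))
    print("KSD^2 on target:       ", ksd_squared(on_target, standard_normal_score))
    print("KSD^2 off target:      ", ksd_squared(off_target, standard_normal_score))
```

The MMD estimate needs samples from both distributions, whereas the KSD estimate needs samples from only one distribution plus the target's score function. That is why the KSD is practical for measuring sample quality when the target cannot be sampled directly.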