Stationary MMD Points for Cubature
Zonghao Chen, Toni Karvonen, Heishiro Kanagawa, François-Xavier Briol, Chris J. Oates
TL;DR
This work addresses the problem of approximating a target distribution $\mu$ by $n$ particles $x_1, \dots, x_n$, with empirical measure $\mu_n$, using the maximum mean discrepancy (MMD) as the guiding criterion. It introduces stationary MMD points, a relaxed and computable alternative to global MMD minimisers, and proves a super-convergence property: for functions $f$ in the RKHS $\mathcal{H}$, the cubature error of stationary MMD points satisfies $\left|\frac{1}{n}\sum_{i=1}^{n} f(x_i) - \int f \, d\mu\right| = o(\operatorname{MMD}(\mu, \mu_n))$, and the cubature rule is exact on $\mathcal{F}_n = \mathrm{span}\{1\} \cup \mathcal{G}_{\mathcal{X}_n}$. To compute these points in practice, the paper develops a noisy MMD gradient-flow scheme that evolves particles toward stationarity, and provides a non-asymptotic, finite-particle error bound whose rate balances optimisation and estimation errors. Empirically, stationary MMD points outperform several baselines on mixtures of Gaussians and OpenML datasets, and the experiments illustrate the impact of kernel choice and of the gradient-flow dynamics. Together, the super-convergence theory and the finite-particle convergence guarantee support the use of stationary MMD points for kernel-based cubature and coreset construction.
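For orientation, the quantities in this summary can be written out using standard definitions. The symbol $k$ for the reproducing kernel of $\mathcal{H}$ is an assumption of this note, and the paper's precise definition of stationarity may differ in scaling or technical conditions. For particles $x_1, \dots, x_n$ with empirical measure $\mu_n$,

$$\operatorname{MMD}(\mu, \mu_n)^2 = \iint k(x, y) \, d\mu(x) \, d\mu(y) - \frac{2}{n} \sum_{i=1}^{n} \int k(x_i, y) \, d\mu(y) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k(x_i, x_j),$$

and a point set is stationary, in the usual first-order sense, when $\nabla_{x_i} \operatorname{MMD}(\mu, \mu_n)^2 = 0$ for every $i$. The Cauchy-Schwarz inequality in $\mathcal{H}$ gives the generic worst-case bound

$$\left| \frac{1}{n} \sum_{i=1}^{n} f(x_i) - \int f \, d\mu \right| \le \|f\|_{\mathcal{H}} \, \operatorname{MMD}(\mu, \mu_n),$$

which holds for any point set. Super-convergence is the statement that, at stationary MMD points, the left-hand side is of strictly smaller order than the right-hand side.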
Abstract
Approximation of a target probability distribution using a finite set of points is a problem of fundamental importance, arising in cubature, data compression, and optimisation. Several authors have proposed to select points by minimising a maximum mean discrepancy (MMD), but the non-convexity of this objective precludes global minimisation in general. Instead, we consider \emph{stationary} points of the MMD which, in contrast to points globally minimising the MMD, can be accurately computed. Our main theoretical contribution is the (perhaps surprising) result that, for integrands in the associated reproducing kernel Hilbert space, the cubature error of stationary MMD points vanishes \emph{faster} than the MMD. Motivated by this \emph{super-convergence} property, we consider discretised gradient flows as a practical strategy for computing stationary points of the MMD, presenting a refined convergence analysis that establishes a novel non-asymptotic finite-particle error bound, which may be of independent interest.
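The idea of a discretised, noise-perturbed MMD gradient flow can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the authors' algorithm: the Gaussian kernel, the replacement of $\mu$ by a large sample `target_sample`, the additive noise with a $1/\sqrt{1+t}$ decay, and all function names and default parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian kernel matrix k(x_i, y_j) for x of shape (n, d), y of shape (m, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))


def mmd_squared(particles, target_sample, lengthscale=1.0):
    """Squared MMD between the empirical measures of the two point sets."""
    k_xx = gaussian_kernel(particles, particles, lengthscale)
    k_xy = gaussian_kernel(particles, target_sample, lengthscale)
    k_yy = gaussian_kernel(target_sample, target_sample, lengthscale)
    return k_xx.mean() - 2.0 * k_xy.mean() + k_yy.mean()


def mean_kernel_gradient(x, y, lengthscale=1.0):
    """For each x_i, the average over j of the gradient of k(x_i, y_j) in x_i."""
    diffs = x[:, None, :] - y[None, :, :]        # shape (n, m, d)
    k = gaussian_kernel(x, y, lengthscale)       # shape (n, m)
    grads = -(diffs / lengthscale ** 2) * k[:, :, None]
    return grads.mean(axis=1)                    # shape (n, d)


def witness_gradient(particles, target_sample, lengthscale=1.0):
    """Equals (n / 2) times the gradient of MMD^2 with respect to each particle."""
    repulsion = mean_kernel_gradient(particles, particles, lengthscale)
    attraction = mean_kernel_gradient(particles, target_sample, lengthscale)
    return repulsion - attraction


def noisy_mmd_descent(particles, target_sample, n_steps=300, step_size=0.2,
                      noise_scale=0.1, lengthscale=1.0, seed=0):
    """Perturbed gradient descent on MMD^2; the noise schedule is an assumption."""
    rng = np.random.default_rng(seed)
    x = np.array(particles, dtype=float)
    for t in range(n_steps):
        grad = witness_gradient(x, target_sample, lengthscale)
        noise = noise_scale / np.sqrt(1.0 + t) * rng.standard_normal(x.shape)
        x = x - step_size * grad + noise
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target_sample = rng.normal(size=(300, 2))         # stand-in for the target
    init = 2.0 + 0.5 * rng.normal(size=(15, 2))       # deliberately poor start
    final = noisy_mmd_descent(init, target_sample)
    print("MMD^2 before:", mmd_squared(init, target_sample))
    print("MMD^2 after: ", mmd_squared(final, target_sample))
```

Rescaling the gradient by $n/2$ is the usual particle-flow convention, so each particle moves at a speed that does not shrink as $n$ grows. The decaying noise is meant to help particles leave poor configurations early on while letting them settle near a stationary point later; a cubature estimate is then simply the average of $f$ over the returned particles.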
