Optimizing Kernel Discrepancies via Subset Selection
Deyao Chen, François Clément, Carola Doerr, Nathan Kirk
TL;DR
This work extends subset selection from the traditional $L_\infty$ star discrepancy to kernel discrepancies: given a large pool of $n$ points, it extracts an $m$-point subset with low discrepancy, for uniform targets via $d^{\mathcal{H}}_{2,F}(P)$ and for nonuniform targets via the kernel Stein discrepancy (KSD). The method is a swap-based heuristic with fast updates. It relies on the decomposition $d^{\mathcal{H}}_{2,F}(P)^2 = c + \sum_{i,j} V(i,j)$ and an auxiliary $B$-array to compute the gain of a candidate swap efficiently, and it uses restart and initialization schemes to escape local optima. Experiments on the $L_2$ star discrepancy and KSD objectives show that kernel-based subset selection often matches or surpasses existing $L_\infty$-focused methods in higher dimensions and yields substantial improvements over Stein Points for nonuniform targets. The approach gives a scalable, gradient-free way to optimize discrepancy-based objectives, with possible applications such as MCMC thinning. Open directions include exact formulations of the problem and the known pathologies of KSD.
Abstract
Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of kernel discrepancies, selecting an $m$-element subset from a large population of size $n \gg m$. We introduce a novel subset selection algorithm applicable to general kernel discrepancies. It efficiently generates low-discrepancy samples both from the uniform distribution on the unit hypercube, the traditional setting of classical QMC, and from more general distributions $F$ with known density functions, in the latter case by employing the kernel Stein discrepancy. We also explore the relationship between the classical $L_2$ star discrepancy and its $L_\infty$ counterpart.
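To make the subset selection problem concrete, the following is a minimal sketch for the uniform case. It evaluates the squared $L_2$ star discrepancy with the standard Warnock formula, $\frac{1}{3^d} - \frac{2}{n}\sum_{i}\prod_{k}\frac{1 - x_{i,k}^2}{2} + \frac{1}{n^2}\sum_{i,j}\prod_{k}\bigl(1 - \max(x_{i,k}, x_{j,k})\bigr)$, and improves a random $m$-subset by accepting any single swap that lowers it. The function names `l2_star_sq` and `swap_select` are hypothetical, and the sketch recomputes the objective from scratch for every candidate swap; it does not reproduce the fast-update scheme, the restarts, or the initialization strategies of the paper.

```python
import numpy as np


def l2_star_sq(pts):
    """Squared L2 star discrepancy of pts (shape (n, d)) via Warnock's formula."""
    n, d = pts.shape
    term1 = 3.0 ** (-d)
    term2 = (2.0 / n) * np.prod((1.0 - pts ** 2) / 2.0, axis=1).sum()
    # Coordinate-wise maximum for every pair of points, shape (n, n, d).
    pairwise_max = np.maximum(pts[:, None, :], pts[None, :, :])
    term3 = np.prod(1.0 - pairwise_max, axis=2).sum() / n ** 2
    return term1 - term2 + term3


def swap_select(pool, m, rng, max_passes=20):
    """Naive swap heuristic (illustrative only, not the paper's fast update).

    Starts from a random m-subset of pool and accepts any swap of one chosen
    point for one unchosen point that strictly lowers the discrepancy.
    """
    n = pool.shape[0]
    chosen = [int(i) for i in rng.choice(n, size=m, replace=False)]
    best = l2_star_sq(pool[chosen])
    for _ in range(max_passes):
        improved = False
        for pos in range(m):
            for cand in range(n):
                if cand in chosen:
                    continue
                trial = chosen.copy()
                trial[pos] = cand
                val = l2_star_sq(pool[trial])
                if val < best - 1e-15:
                    chosen, best = trial, val
                    improved = True
        if not improved:
            break  # no single swap improves: local optimum reached
    return np.array(chosen), best


rng = np.random.default_rng(0)
pool = rng.random((40, 2))          # pool of n = 40 random points in [0, 1]^2
idx, val = swap_select(pool, 8, rng)
print(sorted(idx.tolist()), val)
```

Each objective evaluation here costs $O(m^2 d)$, so one full pass over all swaps costs $O(n m^3 d)$. That cost is what motivates maintaining incremental quantities per point rather than recomputing the double sum.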
