Optimizing Kernel Discrepancies via Subset Selection

Deyao Chen; François Clément; Carola Doerr; Nathan Kirk

TL;DR

This work extends subset selection from the traditional $L_\infty$ star discrepancy to kernel discrepancies: given a large pool of $n$ points, it extracts an $m$-point subset with low discrepancy for both uniform and nonuniform targets, measured by $d^{\mathcal{H}}_{2,F}(P)$ and the kernel Stein discrepancy (KSD). It introduces a swap-based heuristic with fast updates: the decomposition $d^{\mathcal{H}}_{2,F}(P)^2 = c + \sum_{i,j} V(i,j)$, together with an auxiliary array $B$, lets the gain of a candidate swap be computed efficiently, and restart and initialization schemes help the search escape local optima. Across $L_2$ star discrepancy and KSD objectives, the authors report that kernel-based subset selection often matches or surpasses existing $L_\infty$-focused methods in higher dimensions and improves substantially on Stein Points for nonuniform targets. The approach gives a scalable, gradient-free way to optimize discrepancy-based objectives, with possible applications such as MCMC thinning. The authors identify exact formulations and the handling of known KSD pathologies as directions for further work. Overall, the work advances kernel-based design of low-discrepancy samples for both classical QMC and density-targeted sampling.
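The swap idea can be illustrated with a minimal sketch for the $L_2$ star discrepancy with a uniform target. This is not the authors' implementation: the function names (`l2_star_sq`, `pool_terms`, `swap_select`), the best-improvement swap rule, the random initialization, and the exact meaning given to the array `B` (here `B[k]` is the sum of `V[k, j]` over the currently selected `j`) are all assumptions made for illustration. The constant `c` and the matrix `V` are derived from Warnock's formula for the squared $L_2$ star discrepancy.

```python
import numpy as np

def l2_star_sq(P):
    """Squared L2 star discrepancy of points P in [0,1]^d (Warnock's formula)."""
    m, d = P.shape
    g = np.prod((1.0 - P**2) / 2.0, axis=1)
    K = np.prod(1.0 - np.maximum(P[:, None, :], P[None, :, :]), axis=2)
    return 3.0 ** (-d) - 2.0 * g.sum() / m + K.sum() / m**2

def pool_terms(X, m):
    """Return (c, V) such that, for any index set S of size m,
    l2_star_sq(X[S]) == c + V[S][:, S].sum()."""
    d = X.shape[1]
    g = np.prod((1.0 - X**2) / 2.0, axis=1)
    K = np.prod(1.0 - np.maximum(X[:, None, :], X[None, :, :]), axis=2)
    V = (K - g[:, None] - g[None, :]) / m**2
    return 3.0 ** (-d), V

def swap_select(X, m, rng, max_iter=1000, tol=1e-12):
    """Best-improvement swap heuristic (illustrative, assumed variant)."""
    n = X.shape[0]
    c, V = pool_terms(X, m)
    diag = np.diag(V).copy()
    in_S = np.zeros(n, dtype=bool)
    in_S[rng.choice(n, size=m, replace=False)] = True  # random start
    B = V[:, in_S].sum(axis=1)  # B[k] = sum of V[k, j] over selected j
    for _ in range(max_iter):
        S = np.flatnonzero(in_S)
        O = np.flatnonzero(~in_S)
        # delta[a, b]: change in the objective when S[a] leaves and O[b] enters
        delta = ((diag[S] - 2.0 * B[S])[:, None]
                 + (diag[O] + 2.0 * B[O])[None, :]
                 - 2.0 * V[np.ix_(S, O)])
        a, b = np.unravel_index(np.argmin(delta), delta.shape)
        if delta[a, b] >= -tol:
            break  # no improving swap: local optimum
        i, k = S[a], O[b]
        in_S[i], in_S[k] = False, True
        B += V[:, k] - V[:, i]  # O(n) update instead of recomputing B
    S = np.flatnonzero(in_S)
    return S, c + V[np.ix_(S, S)].sum()
```

In this sketch each candidate swap is scored from `B` and `V` alone, so a full pass over all swaps costs $O(m(n-m))$ after the one-time $O(n^2 d)$ construction of `V`. Restarts would amount to calling `swap_select` with different random generators and keeping the best result.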

Abstract

Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of kernel discrepancies, selecting an $m$-element subset from a large population of size $n \gg m$. We introduce a novel subset selection algorithm applicable to general kernel discrepancies to efficiently generate low-discrepancy samples both from the uniform distribution on the unit hypercube, the traditional setting of classical QMC, and from more general distributions $F$ with known density functions, the latter by employing the kernel Stein discrepancy. We also explore the relationship between the classical $L_2$ star discrepancy and its $L_\infty$ counterpart.
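For orientation, the link between kernel discrepancy and worst-case error can be written out as a standard identity. The notation below is an assumption chosen for illustration and may differ from the paper's: $K$ is the reproducing kernel of the space $\mathcal{H}$, and $P = \{x_1, \dots, x_m\}$ is the point set. The second display is the special case of the $L_2$ star discrepancy on $[0,1]^d$ with uniform $F$ (Warnock's formula).

```latex
% Worst-case integration error over the unit ball of H equals the kernel discrepancy
\sup_{\|f\|_{\mathcal{H}} \le 1}
  \left| \int f \, dF - \frac{1}{m} \sum_{i=1}^{m} f(x_i) \right|^2
= \iint K(x,y) \, dF(x) \, dF(y)
  - \frac{2}{m} \sum_{i=1}^{m} \int K(x_i, y) \, dF(y)
  + \frac{1}{m^2} \sum_{i,j=1}^{m} K(x_i, x_j)

% Special case: K(x,y) = \prod_{k=1}^{d} (1 - \max(x_k, y_k)), F uniform on [0,1]^d
L_2^{*}(P)^2
= \frac{1}{3^{d}}
  - \frac{2}{m} \sum_{i=1}^{m} \prod_{k=1}^{d} \frac{1 - x_{i,k}^2}{2}
  + \frac{1}{m^2} \sum_{i,j=1}^{m} \prod_{k=1}^{d} \bigl(1 - \max(x_{i,k}, x_{j,k})\bigr)
```

The first term does not depend on the points, and the remaining terms can be grouped into a sum over pairs $(i,j)$, which is the form a constant plus a pairwise sum takes for a fixed subset size $m$.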
